Abstract

We study the problem of obtaining lower bounds for polynomial calculus (PC) and polynomial 
calculus resolution (PCR) on proof degree, and hence by [Impagliazzo et al. ’99] also on proof size. 
[Alekhnovich and Razborov ’03] established that if the clause-variable incidence graph of a CNF 
formula F is a good enough expander, then proving that F is unsatisfiable requires high PC/PCR 
degree. We further develop the techniques in [AR03] to show that if one can “cluster” clauses and 
variables in a way that “respects the structure” of the formula in a certain sense, then it is sufficient 
that the incidence graph of this clustered version is an expander. As a corollary of this, we prove that 
the functional pigeonhole principle (FPHP) formulas require high PC/PCR degree when restricted to 
constant-degree expander graphs. This answers an open question in [Razborov ’02], and also implies 
that the standard CNF encoding of the FPHP formulas requires exponential proof size in polynomial 
calculus resolution. Thus, while Onto-FPHP formulas are easy for polynomial calculus, as shown 
in [Riis ’93], both FPHP and Onto-PHP formulas are hard even when restricted to bounded-degree 
expanders. 


1 Introduction 

In one sentence, proof complexity studies how hard it is to certify the unsatisfiability of formulas in
conjunctive normal form (CNF). In its most general form, this is the question of whether coNP can be
separated from NP or not, and as such it still appears almost completely out of reach. However, if one
instead focuses on concrete proof systems, which can be thought of as restricted models of
(nondeterministic) computation, then fruitful study is possible.

1.1 Resolution and Polynomial Calculus 

Perhaps the most well-studied proof system in proof complexity is resolution [Bla37], in which one 
derives new disjunctive clauses from a CNF formula until an explicit contradiction is reached, and for 
which numerous exponential lower bounds on proof size have been shown (starting with [Hak85, Urq87, 
CS88]). Many of these lower bounds can be established by instead studying the width of proofs, i.e., 
the size of a largest clause appearing in the proofs, and arguing that any resolution proof for a certain 
formula must contain a large clause. It then follows from a result by Ben-Sasson and Wigderson [BW01]
that any resolution proof must also consist of very many clauses. Research since [BW01] has led to a
well-developed machinery for showing width lower bounds, and hence also size lower bounds. 

The focus of the current paper is the slightly more general proof system polynomial calculus resolution
(PCR). This proof system was introduced by Clegg et al. [CEI96] in a slightly weaker form that is
usually referred to as polynomial calculus (PC) and was later extended by Alekhnovich et al. [ABRW02].

*This is the full-length version of the paper with the same title to appear in Proceedings of the 30th Annual Computational 
Complexity Conference (CCC ’15). 
In PC and PCR clauses are translated to multilinear polynomials over some (fixed) field $\mathbb{F}$, and a CNF
formula F is shown to be unsatisfiable by proving that the constant 1 lies in the ideal generated by the
polynomials corresponding to the clauses of F. Here the size of a proof is measured as the number 
of monomials in a proof when all polynomials are expanded out as linear combinations of monomials, 
and the width of a clause corresponds to the (total) degree of the polynomial representing the clause. 
Briefly, the difference between PC and PCR is that the latter proof system has separate formal variables 
for positive and negative literals over the same variable. Thanks to this, one can encode wide clauses into 
polynomials compactly regardless of the sign of the literals in the clauses, which allows PCR to simulate 
resolution efficiently. With respect to the degree measure polynomial calculus and polynomial calculus 
resolution are exactly the same, and furthermore the degree needed to prove in polynomial calculus that 
a formula is unsatisfiable is at most the width required in resolution. 
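
To make the difference between the two encodings concrete, the following minimal Python sketch expands one clause under both translations. The representation (a polynomial as a dict from frozensets of variable names to integer coefficients, the `~` prefix for the extra PCR variable of a negative literal, and the function names) is our own illustrative choice and not notation from the paper; integer coefficients merely stand in for an arbitrary field.

```python
def pcr_encoding(clause):
    """PCR: every literal has its own formal variable, so a clause is one monomial."""
    monomial = frozenset(var if positive else "~" + var for var, positive in clause)
    return {monomial: 1}


def pc_encoding(clause):
    """PC: a negative literal over x is written (1 - x) and the product is expanded."""
    poly = {frozenset(): 1}
    for var, positive in clause:
        if positive:
            factor = {frozenset([var]): 1}
        else:
            factor = {frozenset(): 1, frozenset([var]): -1}
        product = {}
        for m1, c1 in poly.items():
            for m2, c2 in factor.items():
                product[m1 | m2] = product.get(m1 | m2, 0) + c1 * c2
        poly = {m: c for m, c in product.items() if c != 0}
    return poly


# Hypothetical clause: x1 OR (NOT x2) OR (NOT x3) OR (NOT x4).
clause = [("x1", True), ("x2", False), ("x3", False), ("x4", False)]
print(len(pcr_encoding(clause)), len(pc_encoding(clause)))
```

For this clause the PCR encoding is a single monomial while the PC encoding has eight, although both have degree 4. This is the size gap (but not degree gap) described above.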

In a work that served, interestingly enough, as a precursor to [BW01], Impagliazzo et al. [IPS99]
showed that strong lower bounds on the degree of PC proofs are sufficient to establish strong size lower 
bounds. The same proof goes through for PCR, and hence any lower bound on proof size obtained via a 
degree lower bound applies to both PC and PCR. In this paper, we will therefore be somewhat sloppy in 
distinguishing the two proof systems, sometimes writing “polynomial calculus” to refer to both systems 
when the results apply to both PC and PCR. 

In contrast to the situation for resolution after [BW01], the paper [IPS99] has not been followed by
a corresponding development of a generally applicable machinery for proving degree lower bounds. For
fields of characteristic distinct from 2 it is sometimes possible to obtain lower bounds by doing an affine
transformation from {0, 1} to the “Fourier basis” {-1, +1}, an idea that seems to have appeared first
in [Gri98, BGIP01]. For fields of arbitrary characteristic Alekhnovich and Razborov [AR03] developed
a powerful technique for general systems of polynomial equations, which when restricted to the standard
encoding of CNF formulas F yields that polynomial calculus proofs require high degree if the
corresponding bipartite clause-variable incidence graphs G(F) are good enough expanders. There are many
formula families for which this is not true, however. One can have a family of constraint satisfaction
problems where the constraint-variable incidence graph is an expander (say, for instance, for an
unsatisfiable set of linear equations mod 2) but where each constraint is then translated into several clauses
when encoded into CNF, meaning that the clause-variable incidence graph G(F) will no longer be
expanding. For some formulas this limitation is inherent: it is not hard to see that an inconsistent system of
linear equations mod 2 is easy to refute in polynomial calculus over $\mathbb{F}_2$, and so good expansion for the
constraint-variable incidence graph should not in itself be sufficient to imply hardness in general. In
other cases, however, it would seem that some kind of expansion of this sort should still be enough,
“morally speaking,” to guarantee that the corresponding CNF formulas are hard.*


1.2 Pigeonhole Principle Formulas 

One important direction in proof complexity, which is the reason research in this area was initiated 
by Cook and Reckhow [CR79], has been to prove superpolynomial lower bounds on proof size for 
increasingly stronger proof systems. For proof systems where such lower bounds have already been 
obtained, however, such as resolution and polynomial calculus, a somewhat orthogonal research direction 
has been to try to gain a better understanding of the strengths and weaknesses of a given proof system 
by studying different combinatorial principles (encoded in CNF) and determining how hard they are to 
prove for this proof system. 

*In a bit more detail, what is shown in [AR03] is that if the constraint-variable incidence graph for a set of polynomial
equations is a good expander, and if these polynomials have high immunity (i.e., do not imply other polynomials of
significantly lower degree), then proving that this set of polynomial equations is inconsistent in polynomial calculus requires high
degree. CNF formulas automatically have maximal immunity since a clause translated into a polynomial does not have any
consequences of degree lower than the width of the clause in question, and hence expansion of the clause-variable incidence
graph is sufficient to imply hardness for polynomial calculus. Any polynomial encoding of a linear equation mod 2 has a
low-degree consequence over $\mathbb{F}_2$, however (namely, the linear equation itself), and this is why [AR03] (correctly) fails to prove
lower bounds in this case.
It seems fair to say that by far the most extensively studied such combinatorial principle is the
pigeonhole principle. This principle is encoded into CNF as unsatisfiable formulas claiming that m pigeons can
be mapped in a one-to-one fashion into n holes for m > n, but there are several choices exactly how to
do this encoding. The most basic pigeonhole principle (PHP) formulas have clauses saying that every
pigeon gets at least one pigeonhole and that no hole contains two pigeons. While these formulas are
already unsatisfiable for $m \geq n + 1$, they do not a priori rule out that there might be “fat” pigeons
residing in several holes. The functional pigeonhole principle (FPHP) formulas perhaps correspond more
closely to our intuitive understanding of the pigeonhole principle in that they also contain functionality
clauses specifying that every pigeon gets exactly one pigeonhole and not more. Another way of making
the basic PHP formulas more constrained is to add onto clauses requiring that every pigeonhole should
get a pigeon, yielding so-called onto-PHP formulas. Finally, the most restrictive encoding, and hence
the hardest one when it comes to proving lower bounds, is given by the onto-FPHP formulas containing both
functionality and onto clauses, i.e., saying that the mapping from pigeons to pigeonholes is a perfect
matching. Razborov’s survey [Raz02] gives a detailed account of these different flavours of the
pigeonhole principle formulas and results for them with respect to various proof systems; we just quickly
highlight some facts relevant to this paper below.

For the resolution proof system there is not much need to distinguish between the different PHP 
versions discussed above. The lower bound by Haken [Hak85] for formulas with m = n + 1 pigeons
can be made to work also for onto-FPHP formulas, and more recent works by Raz [Raz04a] and
Razborov [Raz03, Raz04b] show that the formulas remain exponentially hard (measured in the number 
of pigeonholes n) even for arbitrarily many pigeons m. 

Interestingly enough, for polynomial calculus the story is very different. The first degree lower 
bounds were proven by Razborov [Raz98], but for a different encoding than the standard translation 
from CNF, since translating wide clauses yields initial polynomials of high degree. Alekhnovich and 
Razborov [AR03] proved lower bounds for a 3-CNF version of the pigeonhole principle, from which 
it follows that the standard CNF encoding requires proofs of exponential size. However, as shown 
by Riis [Rii93] the onto-FPHP formulas with m = n + 1 pigeons are easy for polynomial calculus.
And while the encoding in [Raz98] also captures the functionality restriction in some sense, it has
remained open whether the standard CNF encoding of functional pigeonhole principle formulas translated
to polynomials is hard (this question has been highlighted, for instance, in Razborov’s open problems
list [Raz15]).

Another way of modifying the pigeonhole principle is to restrict the choices of pigeonholes for each 
pigeon by defining the formulas over a bipartite graph $H = (U \mathbin{\dot\cup} V, E)$ with $|U| = m$ and $|V| = n$ and
requiring that each pigeon $u \in U$ goes to one of its neighbouring holes in $N(u) \subseteq V$. If the graph H
has constant left degree, the corresponding graph pigeonhole principle formula has constant width and a
linear number of variables, which makes it possible to apply [BW01, IPS99] to obtain exponential proof
size lower bounds from linear width/degree lower bounds. A careful reading of the proofs in [AR03] 
reveals that this paper establishes linear polynomial calculus degree lower bounds (and hence exponential 
size lower bounds) for graph PHP formulas, and in fact also graph Onto-PHP formulas, over constant- 
degree expanders H. Razborov lists as one of the open problems in [Raz02] whether this holds also for 
graph FPHP formulas, i.e., with functionality clauses added, from which exponential lower bounds on 
polynomial calculus proof size for the general FPHP formulas would immediately follow. 
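
As a minimal sketch of the graph FPHP formulas just described, the following Python function lists the pigeon, hole, and functionality clauses over a bipartite graph given as a neighbour map. The function name, the input format (a dict from pigeons to lists of neighbouring holes), and the literal representation `((pigeon, hole), sign)` are assumptions made for illustration only.

```python
from itertools import combinations


def graph_fphp(neighbours):
    """Clauses of the graph FPHP formula over a bipartite graph.

    neighbours maps each pigeon u to the list of holes in N(u).
    The variable (u, v) means "pigeon u sits in hole v"; a literal is a
    pair (variable, sign) with sign True for positive, False for negated.
    """
    clauses = []
    # Pigeon axioms: every pigeon goes to at least one neighbouring hole.
    for u, holes in neighbours.items():
        clauses.append(frozenset(((u, v), True) for v in holes))
    # Hole axioms: no hole receives two distinct pigeons.
    pigeons_of = {}
    for u, holes in neighbours.items():
        for v in holes:
            pigeons_of.setdefault(v, []).append(u)
    for v, pigeons in pigeons_of.items():
        for u1, u2 in combinations(pigeons, 2):
            clauses.append(frozenset({((u1, v), False), ((u2, v), False)}))
    # Functionality axioms: no pigeon sits in two distinct holes.
    for u, holes in neighbours.items():
        for v1, v2 in combinations(holes, 2):
            clauses.append(frozenset({((u, v1), False), ((u, v2), False)}))
    return clauses
```

With constant left degree, every clause produced has constant width and the number of variables is linear in the number of pigeons, which is the setting in which the size-width and size-degree relations apply.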

1.3 Our Results 

We revisit the technique developed in [AR03] for proving polynomial calculus degree lower bounds, 
restricting our attention to the special case when the polynomials are obtained by the canonical translation 
of CNF formulas. 

Instead of considering the standard bipartite clause-variable incidence graph G(F) of a CNF
formula F (with clauses on the left, variables on the right, and edges encoding that a variable occurs in a
clause) we construct a new graph G' by clustering several clauses and/or variables into single vertices,
reflecting the structure of the combinatorial principle the CNF formula F is encoding. The edges in this
new graph G' are the ones induced by the original graph G(F) in the natural way, i.e., there is an edge
from a left cluster to a right cluster in G' if any clause in the left cluster has an edge to any variable in the
right cluster in G(F). We remark that such a clustering is already implicit in, for instance, the resolution
lower bounds in [BW01] for Tseitin formulas (which are essentially just a special form of unsatisfiable
linear equations) and graph PHP formulas, as well as in the graph PHP lower bound for polynomial
calculus in [AR03].

We then show that if this clustering is done in the right way, the proofs in [AR03] still go through 
and yield strong polynomial calculus degree lower bounds when G' is a good enough expander.^ It is 
clear that this cannot work in general: as already discussed above, any inconsistent system of linear
equations mod 2 is easy to refute in polynomial calculus over $\mathbb{F}_2$, even though for a random instance of
this problem the clauses encoding each linear equation can be clustered to yield an excellent expander G'.
Very informally (and somewhat incorrectly) speaking, the clustering should be such that if a cluster of
clauses F' on the left is a neighbour of a variable cluster V on the right, then there should exist an
assignment $\rho$ to V such that $\rho$ satisfies all of F' and such that the clauses outside of F' are either
satisfied by $\rho$ or left completely untouched by $\rho$. Also, it turns out to be helpful not to insist that the
clustering of variables on the right should be a partition, but that we should allow the same variable to
appear in several clusters if needed (as long as the number of clusters for each variable is bounded).

This extension of the lower bound method in [AR03] makes it possible to present previously obtained
polynomial calculus degree lower bounds in [AR03, GL10, MN14] in a unified framework. Moreover, it
allows us to prove the following new results:

1. If a bipartite graph $H = (U \mathbin{\dot\cup} V, E)$ with $|U| = m$ and $|V| = n$ is a boundary expander (a.k.a.
unique-neighbour expander), then the graph FPHP formula over H requires proofs of linear
polynomial calculus degree, and hence exponential polynomial calculus size.

2. Since FPHP formulas can be turned into graph FPHP formulas by hitting them with a restriction,
and since restrictions can only decrease proof size, it follows that FPHP formulas require proofs
of exponential size in polynomial calculus.

This fills in the last missing pieces in our understanding of the different flavours of pigeonhole principle
formulas with n + 1 pigeons and n holes for polynomial calculus. Namely, while Onto-FPHP formulas
are easy for polynomial calculus, both FPHP formulas and Onto-PHP formulas are hard even when
restricted to expander graphs.

1.4 Organization of This Paper 

After reviewing the necessary preliminaries in Section 2, we present our extension of the
Alekhnovich-Razborov method in Section 3. In Section 4, we show how this method can be used to rederive some
previous polynomial calculus degree lower bounds as well as to obtain new degree and size lower bounds
for functional (graph) PHP formulas. We conclude in Section 5 by discussing some possible directions
for future research.

2 Preliminaries 

Let us start by giving an overview of the relevant proof complexity background. This material is standard
and we refer to, for instance, the survey [Nor13] for more details.

A literal over a Boolean variable x is either the variable x itself (a positive literal) or its negation $\neg x$
or $\overline{x}$ (a negative literal). We define $\overline{\overline{x}} = x$. We identify 0 with true and 1 with false. We remark that this is
the opposite of the standard convention in proof complexity, but it is a more natural choice in the context
of polynomial calculus, where “evaluating to true” means “vanishing.” A clause $C = a_1 \lor \cdots \lor a_k$ is a

^For a certain twist of the definition of expander that we do not describe in full detail here in order to keep the discussion at 
an informal, intuitive level. The formal description is given in Section 3.1. 
disjunction of literals. A CNF formula $F = C_1 \land \cdots \land C_m$ is a conjunction of clauses. The width W(C)
of a clause C is the number of literals |C| in it, and the width W(F) of the formula F is the maximum
width of any clause in the formula. We think of clauses and CNF formulas as sets, so that order is
irrelevant and there are no repetitions. A k-CNF formula has all clauses of size at most k, where k is
assumed to be some fixed constant.

In polynomial calculus resolution the goal is to prove the unsatisfiability of a CNF formula by
reasoning with polynomials from a polynomial ring $\mathbb{F}[x, \overline{x}, y, \overline{y}, \ldots]$ (where $x$ and $\overline{x}$ are viewed as distinct
formal variables) over some fixed field $\mathbb{F}$. The results in this paper hold for all fields $\mathbb{F}$ regardless of
characteristic. In what follows, a monomial m is a product of variables and a term t is a monomial multiplied
by an arbitrary non-zero field element.

Definition 2.1 (Polynomial calculus resolution (PCR) [CEI96, ABRW02]). A polynomial calculus
resolution (PCR) refutation $\pi : F \vdash \bot$ of a CNF formula F (also referred to as a PCR proof for F) over
a field $\mathbb{F}$ is an ordered sequence of polynomials $\pi = (P_1, \ldots, P_\tau)$, expanded out as linear combinations
of monomials, such that $P_\tau = 1$ and each line $P_i$, $1 \leq i \leq \tau$, is either

• a monomial $\prod_{x \in L^+} x \cdot \prod_{y \in L^-} \overline{y}$ encoding a clause $\bigvee_{x \in L^+} x \lor \bigvee_{y \in L^-} \overline{y}$ in F (a clause axiom);

• a Boolean axiom $x^2 - x$ or complementarity axiom $x + \overline{x} - 1$ for any variable x;

• a polynomial obtained from one or two previous polynomials in the sequence by linear combination
(from Q and R derive $\alpha Q + \beta R$) or multiplication (from Q derive $xQ$) for any $\alpha, \beta \in \mathbb{F}$ and any variable x.

If we drop complementarity axioms and encode each negative literal $\overline{x}$ as $(1 - x)$, the proof system is
called polynomial calculus (PC).

The size $S(\pi)$ of a PC/PCR refutation $\pi = (P_1, \ldots, P_\tau)$ is the number of monomials in $\pi$ (counted
with repetitions),^ the degree $\mathrm{Deg}(\pi)$ is the maximal degree of any monomial appearing in $\pi$, and the
length $L(\pi)$ is the number $\tau$ of polynomials in $\pi$. Taking the minimum over all PCR refutations of a
formula F, we define the size $S_{\mathcal{PCR}}(F \vdash \bot)$, degree $\mathrm{Deg}_{\mathcal{PCR}}(F \vdash \bot)$, and length $L_{\mathcal{PCR}}(F \vdash \bot)$ of
refuting F in PCR (and analogously for PC).

We write Vars(C) and Vars(m) to denote the set of all variables appearing in a clause C or monomial
(or term) m, respectively, and extend this notation to CNF formulas and polynomials by taking
unions. We use the notation $\langle P_1, \ldots, P_m \rangle$ for the ideal generated by the polynomials $P_i$, $i \in [m]$. That
is, $\langle P_1, \ldots, P_m \rangle$ is the minimal subset of polynomials containing all $P_i$ that is closed under addition
and multiplication by any polynomial. One way of viewing a polynomial calculus (PC or PCR) refutation
is as a calculation in the ideal generated by the encodings of clauses in F and the Boolean and
complementarity axioms. It can be shown that such an ideal contains 1 if and only if F is unsatisfiable.

As mentioned above, we have $\mathrm{Deg}_{\mathcal{PCR}}(F \vdash \bot) = \mathrm{Deg}_{\mathcal{PC}}(F \vdash \bot)$ for any CNF formula F. This
claim can essentially be verified by taking any PCR refutation of F and replacing all occurrences of $\overline{y}$
by $(1 - y)$ to obtain a valid PC refutation in the same degree. Hence, we can drop the subscript from the
notation for the degree measure. We have the following relation between refutation size and refutation
degree (which was originally proven for PC but the proof of which also works for PCR).

Theorem 2.2 ([IPS99]). Let F be an unsatisfiable CNF formula of width W(F) over n variables. Then

$$S_{\mathcal{PCR}}(F \vdash \bot) = \exp\left(\Omega\left(\frac{(\mathrm{Deg}(F \vdash \bot) - W(F))^2}{n}\right)\right).$$

Thus, for k-CNF formulas it is sufficient to prove strong enough lower bounds on the PC degree of
refutations to establish strong lower bounds on PCR proof size.
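
To see how this relation is typically applied, consider (as a purely illustrative assumption) a k-CNF formula F over n variables with a linear degree lower bound $\mathrm{Deg}(F \vdash \bot) \geq \varepsilon n$ for some constant $\varepsilon > 0$. Plugging this into the size-degree relation of [IPS99], in its usual form with exponent $(\mathrm{Deg} - W)^2 / n$, gives for large enough n:

```latex
\[
  S_{\mathcal{PCR}}(F \vdash \bot)
  \;\geq\; \exp\!\left(\Omega\!\left(\frac{(\varepsilon n - k)^2}{n}\right)\right)
  \;=\; \exp\bigl(\Omega(n)\bigr).
\]
```

By contrast, a degree lower bound of order $\sqrt{n}$ or smaller yields nothing nontrivial from this relation, since the exponent is then $O(1)$.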

^We remark that the natural definition of size is to count monomials with repetition, but all lower bound techniques known 
actually establish slightly stronger lower bounds on the number of distinct monomials. 
Furthermore, it will be convenient for us to simplify the definition of PC so that axioms $x^2 - x$
are always applied implicitly whenever possible. We do this by defining the result of the
multiplication operation to be the multilinearized version of the product. This can only decrease the degree
(and size) of the refutation, and is in fact how polynomial calculus is defined in [AR03]. Hence, from
now on whenever we refer to polynomials and monomials we mean multilinear polynomials and
multilinear monomials, respectively, and polynomial calculus is defined over the (multilinear) polynomial
ring $\mathbb{F}[x, y, z, \ldots]/(x^2 - x, y^2 - y, z^2 - z, \ldots)$.
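
A minimal Python sketch of multilinearized multiplication follows. A monomial is represented as a frozenset of variable names, so taking the union of two monomials applies $x^2 = x$ implicitly. The representation, the function name, and the use of integer coefficients in place of a general field are assumptions made for illustration.

```python
def multilinear_mul(p, q):
    """Multiply two polynomials and multilinearize the product.

    A polynomial is a dict mapping a monomial (frozenset of variable
    names) to a nonzero coefficient. The union m1 | m2 collapses any
    repeated variable, which is exactly reduction modulo x^2 - x.
    """
    result = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = m1 | m2
            result[m] = result.get(m, 0) + c1 * c2
    # Drop monomials whose coefficients cancelled.
    return {m: c for m, c in result.items() if c != 0}


x, y = frozenset(["x"]), frozenset(["y"])
# (x + y)(x - y) = x^2 - y^2, which multilinearizes to x - y.
print(multilinear_mul({x: 1, y: 1}, {x: 1, y: -1}) == {x: 1, y: -1})
```

Working over another field would only require replacing the integer arithmetic on coefficients; the handling of monomials stays the same.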

It might be worth noticing that for this modified definition of polynomial calculus it holds that any
(unsatisfiable) k-CNF formula can be refuted in linear length (and hence, in contrast to resolution, the
size of refutations, rather than the length, is the right measure to focus on). This is not hard to show, and
in some sense is probably folklore, but since it does not seem to be too widely known we state it for the
record and provide a proof.

Proposition 2.3. Any unsatisfiable k-CNF formula F has a (multilinear) polynomial calculus refutation 
of length linear in the size of the formula F. 

Proof. We show by induction how to derive polynomials $P_j = 1 - \prod_{i=1}^{j} (1 - C_i)$ in length linear in j,
where we identify the clause $C_i$ in $F = \bigwedge_{i=1}^{m} C_i$ with the polynomial encoding of this clause. The end
result is the polynomial $P_m = 1 - \prod_{i=1}^{m} (1 - C_i)$. As F is unsatisfiable, for every 0-1 assignment there
is at least one $C_i$ that evaluates to 1 and hence $P_m$ evaluates to 1. Thus, $P_m$ is equal to 1 on all 0-1
assignments. However, it is a basic fact that every function $f : \{0, 1\}^n \to \mathbb{F}$ is uniquely representable
as a multilinear polynomial in $\mathbb{F}[x_1, \ldots, x_n]$ (since the multilinear monomials span this vector space and
are linearly independent, they form a basis). Therefore, it follows that $P_m$ is syntactically equal to the
polynomial 1.

The base case of the induction is the polynomial $P_1$ that is equal to $C_1$. To prove the induction step,
we need to show how to derive

$$P_{j+1} = 1 - \prod_{i=1}^{j+1} (1 - C_i) = 1 - (1 - C_{j+1})(1 - P_j) = P_j + C_{j+1} - C_{j+1} P_j \qquad (2.1)$$

from $P_j$ and $C_{j+1}$ in a constant number of steps. To start, we derive $C_{j+1} P_j$ from $P_j$, which can be
done with a constant number of multiplications and additions since the width/degree of $C_{j+1}$ is
upper-bounded by the constant k. We derive $P_{j+1}$ in two more steps by first taking a linear combination of $P_j$
and $C_{j+1} P_j$ to get $P_j - C_{j+1} P_j$ and then adding $C_{j+1}$ to this to obtain $P_j - C_{j+1} P_j + C_{j+1} = P_{j+1}$.
The proposition follows. □
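
The induction in this proof can be replayed mechanically on small formulas. The sketch below is an illustration under our own assumptions: polynomials are dicts from frozensets of variables to integer coefficients, clauses are lists of `(variable, sign)` pairs, and the product $C_{j+1} P_j$ is computed in one call rather than broken into the constantly many single-variable multiplications that an actual refutation would use.

```python
ONE = frozenset()  # the empty monomial, i.e. the constant 1


def mul(p, q):
    """Multilinearized product of two polynomials."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[m1 | m2] = out.get(m1 | m2, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}


def lin(p, q, a=1, b=1):
    """Linear combination a*p + b*q."""
    out = {}
    for poly, coef in ((p, a), (q, b)):
        for m, c in poly.items():
            out[m] = out.get(m, 0) + coef * c
    return {m: c for m, c in out.items() if c != 0}


def encode(clause):
    """PC encoding with 0 = true: positive literal x -> x, negative -> (1 - x)."""
    poly = {ONE: 1}
    for var, positive in clause:
        x = frozenset([var])
        poly = mul(poly, {x: 1} if positive else {ONE: 1, x: -1})
    return poly


def derive_last_polynomial(clauses):
    """Compute P_m via P_{j+1} = P_j - C_{j+1} P_j + C_{j+1}, starting from P_1 = C_1."""
    P = encode(clauses[0])
    for clause in clauses[1:]:
        C = encode(clause)
        P = lin(lin(P, mul(C, P), 1, -1), C)
    return P


# Hypothetical unsatisfiable formula: x AND (NOT x OR y) AND (NOT y).
unsat = [[("x", True)], [("x", False), ("y", True)], [("y", False)]]
print(derive_last_polynomial(unsat) == {ONE: 1})
```

For a satisfiable input the final polynomial differs from 1, since it vanishes on every satisfying assignment.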

We will also need to use restrictions. A restriction $\rho$ on F is a partial assignment to the variables
of F. We use $\mathrm{Dom}(\rho)$ to denote the set of variables assigned by $\rho$. In a restricted formula $F{\upharpoonright}_\rho$ all clauses
satisfied by $\rho$ are removed and all other clauses have falsified literals removed. For a PC refutation $\pi$
restricted by $\rho$ we have that if $\rho$ satisfies a literal in a monomial, then that monomial is set to 0 and
vanishes, and all falsified literals in a monomial get replaced by 1 and disappear. It is not hard to see that
if $\pi$ is a PC (or PCR) refutation of F, then $\pi{\upharpoonright}_\rho$ is a PC (or PCR) refutation of $F{\upharpoonright}_\rho$, and this restricted
refutation has at most the same size, degree, and length as the original refutation.
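
On the level of CNF formulas, applying a restriction can be sketched as follows. The clause representation (collections of `(variable, sign)` pairs) and the encoding of a partial assignment as a dict from variables to Python booleans (truth values, not the 0/1 values of the polynomial encoding) are illustrative assumptions.

```python
def restrict(clauses, rho):
    """Restrict a CNF formula by a partial assignment rho.

    rho maps variables to True/False (logical truth values). Clauses
    satisfied by rho are removed; in all other clauses the literals
    falsified by rho are removed.
    """
    restricted = []
    for clause in clauses:
        if any(var in rho and rho[var] == positive for var, positive in clause):
            continue  # some literal is satisfied, so the clause disappears
        restricted.append(
            frozenset((var, positive) for var, positive in clause if var not in rho)
        )
    return restricted
```

An empty frozenset in the output corresponds to the empty clause, i.e. the restriction has falsified one of the original clauses.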

3 A Generalization of the Alekhnovich-Razborov Method for CNFs 

Many lower bounds in proof complexity are proved by arguing in terms of expansion. One common 
approach is to associate a bipartite graph G(F) with the CNF formula F with clauses on one side and
variables on the other and with edges encoding that a variable occurs in a clause (the so-called clause- 
variable incidence graph mentioned in the introduction). The method we present below, which is an 
extension of the techniques developed by Alekhnovich and Razborov [AR03] (but restricted to the special 
case of CNF formulas), is a variation on this theme. As already discussed, however, we will need a 
slightly more general graph construction where clauses and variables can be grouped into clusters. We 
begin by describing this construction. 
3.1 A Generalized Clause-Variable Incidence Graph

The key to our construction of generalized clause-variable incidence graphs is to keep track of how 
clauses in a CNF formula are affected by partial assignments. 

Definition 3.1 (Respectful assignments and variable sets). We say that a partial assignment $\rho$ respects
a CNF formula E, or that $\rho$ is E-respectful, if for every clause C in E either $\mathrm{Vars}(C) \cap \mathrm{Dom}(\rho) = \emptyset$
or $\rho$ satisfies C. A set of variables V respects a CNF formula E if there exists an assignment $\rho$ with
$\mathrm{Dom}(\rho) = V$ that respects E.

Example 3.2. Consider the CNF formula $E = (x_1 \lor x_2) \land (\overline{x}_1 \lor x_3) \land (x_1 \lor x_4) \land (\overline{x}_1 \lor x_5)$ and the
subsets of variables $V_1 = \{x_1, x_2, x_3\}$ and $V_2 = \{x_4, x_5\}$. The assignment $\rho_2$ to $V_2$ setting $x_4$ and $x_5$
to true respects E since it satisfies the clauses containing these variables, and hence $V_2$ is E-respectful.
However, $V_1$ is not E-respectful since setting $x_1$ will affect all clauses in E but cannot satisfy both $x_1 \lor x_4$
and $\overline{x}_1 \lor x_5$.
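
The example can be checked by brute force. The sketch below assumes that E is the formula (x1 ∨ x2) ∧ (¬x1 ∨ x3) ∧ (x1 ∨ x4) ∧ (¬x1 ∨ x5); the placement of the negations is our reading of the example and should be treated as an assumption. Function names and the clause representation are likewise illustrative.

```python
from itertools import product


def respects(rho, E):
    """True if every clause of E touched by rho is also satisfied by rho."""
    for clause in E:
        touched = any(var in rho for var, _ in clause)
        satisfied = any(var in rho and rho[var] == positive for var, positive in clause)
        if touched and not satisfied:
            return False
    return True


def variable_set_respects(V, E):
    """True if some assignment with domain exactly V respects E."""
    order = sorted(V)
    return any(
        respects(dict(zip(order, bits)), E)
        for bits in product([False, True], repeat=len(order))
    )


E = [
    [("x1", True), ("x2", True)],
    [("x1", False), ("x3", True)],
    [("x1", True), ("x4", True)],
    [("x1", False), ("x5", True)],
]
print(variable_set_respects({"x4", "x5"}, E))        # V2
print(variable_set_respects({"x1", "x2", "x3"}, E))  # V1
```

Under the stated reading of E, the first check succeeds and the second fails: whichever value x1 receives, one of the clauses x1 ∨ x4 and ¬x1 ∨ x5 is touched but left unsatisfied.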

Definition 3.3 (Respectful satisfaction). Let F and E be CNF formulas and let V be a set of variables.
We say that F is E-respectfully satisfiable by V if there exists a partial assignment $\rho$ with $\mathrm{Dom}(\rho) = V$
that satisfies F and respects E. Such an assignment $\rho$ is said to E-respectfully satisfy F.

Using a different terminology, Definition 3.1 says that $\rho$ is an autarky for E, meaning that $\rho$ satisfies
all clauses in E which it touches, i.e., that $E{\upharpoonright}_\rho \subseteq E$ after we remove all satisfied clauses in $E{\upharpoonright}_\rho$.
Definition 3.3 ensures that the autarky $\rho$ satisfies the formula F.

Recall that we identify a CNF formula $\bigwedge_{i=1}^{m} C_i$ with the set of clauses $\{C_i \mid i \in [m]\}$. In the rest of
this section we will switch freely between these two perspectives. We also change to the notation $\mathcal{F}$ for
the input CNF formula, to free up other letters that will be needed in notation introduced below.

To build a bipartite graph representing the CNF formula $\mathcal{F}$, we will group the formula into
subformulas (i.e., subsets of clauses). In what follows, we write $\mathcal{U}$ to denote the part of $\mathcal{F}$ that will form the left
vertices of the constructed bipartite graph, while E denotes the part of $\mathcal{F}$ which will not be represented in
the graph but will be used to enforce respectful satisfaction. In more detail, $\mathcal{U}$ is a family of subformulas
F of $\mathcal{F}$ where each subformula is one vertex on the left-hand side of the graph. We also consider the
variables of $\mathcal{F}$ to be divided into a family $\mathcal{V}$ of subsets of variables V. In our definition, $\mathcal{U}$ and $\mathcal{V}$ do not
need to be partitions of clauses and variables in $\mathcal{F}$, respectively. This is not too relevant for $\mathcal{U}$ because
we will always define it as a partition, but it turns out to be useful in our applications to have sets in $\mathcal{V}$
share variables. The next definition describes the bipartite graph that we build and distinguishes between
two types of neighbour relations in this graph.

Definition 3.4 (Bipartite $(\mathcal{U}, \mathcal{V})_E$-graph). Let E be a CNF formula, $\mathcal{U}$ be a set of CNF formulas, and $\mathcal{V}$
be a family of sets of variables V that respect E. Then the (bipartite) $(\mathcal{U}, \mathcal{V})_E$-graph is a bipartite graph
with left vertices $F \in \mathcal{U}$, right vertices $V \in \mathcal{V}$, and edges between F and V if $\mathrm{Vars}(F) \cap V \neq \emptyset$. For
every edge (F, V) in the graph we say that F and V are E-respectful neighbours if F is E-respectfully
satisfiable by V. Otherwise, they are E-disrespectful neighbours.
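
For small instances the graph of Definition 3.4 can be built by exhaustive search over assignments to each variable set. The following sketch is only meant to illustrate the definition: the input format (dicts from vertex names to clause lists and variable sets), the function names, and the string labels for the two kinds of edges are our own assumptions, and the search is exponential in the size of each variable set.

```python
from itertools import product


def satisfied(clause, rho):
    return any(var in rho and rho[var] == positive for var, positive in clause)


def respects(rho, E):
    """Every clause of E that rho touches must be satisfied by rho."""
    return all(
        satisfied(clause, rho) or not any(var in rho for var, _ in clause)
        for clause in E
    )


def classify_edges(U, V_family, E):
    """Return {(F_name, V_name): 'respectful' or 'disrespectful'} for all edges.

    U maps left-vertex names to lists of clauses, V_family maps
    right-vertex names to sets of variables, E is a list of clauses.
    """
    edges = {}
    for f_name, F in U.items():
        f_vars = {var for clause in F for var, _ in clause}
        for v_name, V in V_family.items():
            if not (f_vars & set(V)):
                continue  # no shared variable, hence no edge
            order = sorted(V)
            ok = any(
                all(satisfied(c, rho) for c in F) and respects(rho, E)
                for bits in product([False, True], repeat=len(order))
                for rho in [dict(zip(order, bits))]
            )
            edges[(f_name, v_name)] = "respectful" if ok else "disrespectful"
    return edges
```

For instance, with the single left vertex F = (x ∨ y) ∧ (¬x ∨ y) and variable sets {x} and {y}, the set {x} is a disrespectful neighbour of F (no value of x alone satisfies both clauses), while {y} is a respectful neighbour when E is empty.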

We will often write $(\mathcal{U}, \mathcal{V})_E$ as a shorthand for the graph defined by $\mathcal{U}$, $\mathcal{V}$, and E as above. We
will also use standard graph notation and write N(F) to denote the set of all neighbours $V \in \mathcal{V}$ of a
vertex/CNF formula $F \in \mathcal{U}$. It is important to note that the fact that F and V are E-respectful neighbours
can be witnessed by an assignment that falsifies other subformulas $F' \in \mathcal{U} \setminus \{F\}$.

We can view the formation of the $(\mathcal{U}, \mathcal{V})_E$-graph as taking the clause-variable incidence graph $G(\mathcal{F})$
of the CNF formula $\mathcal{F}$, throwing out a part of $\mathcal{F}$, which we denote E, and clustering the remaining
clauses and variables into $\mathcal{U}$ and $\mathcal{V}$. The edge relation in the $(\mathcal{U}, \mathcal{V})_E$-graph follows naturally from this
view, as we put an edge between two clusters if there is an edge between any two elements of these
clusters. The only additional information we need to keep track of is which clause and variable clusters
are E-respectful neighbours and which are not.
Definition 3.5 (Respectful boundary). For a $(\mathcal{U}, \mathcal{V})_E$-graph and a subset $\mathcal{U}' \subseteq \mathcal{U}$, the E-respectful
boundary $\partial_E(\mathcal{U}')$ of $\mathcal{U}'$ is the family of variable sets $V \in \mathcal{V}$ such that each $V \in \partial_E(\mathcal{U}')$ is an E-respectful
neighbour of some clause set $F \in \mathcal{U}'$ but is not a neighbour (respectful or disrespectful) of any other
clause set $F' \in \mathcal{U}' \setminus \{F\}$.
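
Given the edges of the graph together with their classification, the respectful boundary is simple to compute. In the sketch below the edge dictionary format (keys `(F, V)`, values `"respectful"` or `"disrespectful"`) and the function name are illustrative assumptions.

```python
def respectful_boundary(edges, subset):
    """Respectful boundary of a set of left vertices.

    edges maps (F, V) pairs to 'respectful' or 'disrespectful'.
    A right vertex V is in the boundary if it has exactly one
    neighbour in subset and that single edge is respectful.
    """
    subset = set(subset)
    boundary = set()
    for V in {V for (_, V) in edges}:
        neighbours = [F for (F, W) in edges if W == V and F in subset]
        if len(neighbours) == 1 and edges[(neighbours[0], V)] == "respectful":
            boundary.add(V)
    return boundary


# Hypothetical small graph with two left and three right vertices.
edges = {
    ("F1", "V1"): "respectful",
    ("F1", "V2"): "respectful",
    ("F2", "V2"): "respectful",
    ("F2", "V3"): "disrespectful",
}
print(respectful_boundary(edges, {"F1", "F2"}))
```

In this hypothetical graph only V1 lies in the respectful boundary of {F1, F2}: V2 has two neighbours in the set, and the unique neighbour of V3 is disrespectful.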

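It will sometimes be convenient to interpret subsets $\mathcal{U}' \subseteq \mathcal{U}$ as CNF formulas $\bigwedge_{F \in \mathcal{U}'} \bigwedge_{C \in F} C$, and
we will switch back and forth between these two interpretations as seems most suitable. We will show
that a formula $\mathcal{F} = \bigwedge_{F \in \mathcal{U}} \bigwedge_{C \in F} C \land E = \mathcal{U} \land E$ is hard for polynomial calculus with respect to degree
if the $(\mathcal{U}, \mathcal{V})_E$-graph has a certain expansion property as defined next.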

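Definition 3.6 (Respectful boundary expander). A $(\mathcal{U}, \mathcal{V})_E$-graph is said to be an $(s, \delta, \xi, E)$-respectful
boundary expander, or just an $(s, \delta, \xi, E)$-expander for brevity, if for every set $\mathcal{U}' \subseteq \mathcal{U}$, $|\mathcal{U}'| \leq s$, it holds
that $|\partial_E(\mathcal{U}')| \geq \delta |\mathcal{U}'| - \xi$.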

Note that an $(s, \delta, \xi, E)$-respectful boundary expander is a standard bipartite boundary expander
except for two modifications:

• We measure expansion not in terms of the whole boundary but only in terms of the respectful
boundary^ as described in Definition 3.5.

• Also, the size of the boundary $|\partial_E(\mathcal{U}')|$ on the right does not quite have to scale linearly with the
size of the vertex set $|\mathcal{U}'|$ on the left. Instead, we allow an additive loss $\xi$ in the expansion. In our
applications, we can usually construct graphs with good enough expansion so that we can choose
$\xi = 0$, but for one of the results we present it will be helpful to allow a small slack here.
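
For very small graphs the expansion condition can be verified by enumerating all left subsets of size at most s and comparing the size of the respectful boundary with δ|U'| − ξ. The sketch below is a brute-force illustration only (exponential in s); the edge dictionary format and all function names are assumptions made for this example.

```python
from itertools import combinations


def respectful_boundary(edges, subset):
    """Right vertices with exactly one neighbour in subset, via a respectful edge."""
    subset = set(subset)
    boundary = set()
    for V in {V for (_, V) in edges}:
        neighbours = [F for (F, W) in edges if W == V and F in subset]
        if len(neighbours) == 1 and edges[(neighbours[0], V)] == "respectful":
            boundary.add(V)
    return boundary


def is_respectful_boundary_expander(edges, left, s, delta, xi):
    """Check |boundary(U')| >= delta * |U'| - xi for every U' with |U'| <= s."""
    for size in range(1, s + 1):
        for subset in combinations(sorted(left), size):
            if len(respectful_boundary(edges, subset)) < delta * size - xi:
                return False
    return True
```

On a hypothetical graph where the two left vertices together have a respectful boundary of size 1, the check with s = 2 and δ = 1 fails for ξ = 0 but succeeds for ξ = 1, which illustrates the role of the additive slack.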

Before we state our main theorem we need one more technical definition, which is used to ensure 
that there do not exist variables that appear in too many variable sets in V. We remark that the concept 
below is also referred to as the “maximum degree” in the literature, but since we already have degrees of 
polynomials and vertices in this paper we prefer a new term instead of overloading “degree” with a third 
meaning. 

Definition 3.7. The overlap of a variable x with respect to a family of variable sets $\mathcal{V}$ is $ol(x, \mathcal{V}) =
|\{V \in \mathcal{V} : x \in V\}|$ and the overlap of $\mathcal{V}$ is $ol(\mathcal{V}) = \max_x \{ol(x, \mathcal{V})\}$, i.e., the maximum number of sets
$V \in \mathcal{V}$ containing any particular variable x.
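
Computing the overlap of a family of variable sets is a simple counting exercise; a minimal sketch (with a hypothetical function name and the family given as a list of sets) is:

```python
from collections import Counter


def overlap(V_family):
    """Maximum number of sets in V_family that contain one and the same variable."""
    counts = Counter(x for V in V_family for x in set(V))
    return max(counts.values(), default=0)


# Hypothetical family in which the variable y occurs in all three sets.
print(overlap([{"x", "y"}, {"y", "z"}, {"y"}]))
```

A partition of the variables has overlap 1, so the overlap measures how far the family is from being a partition.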

Given the above definitions, we can state the main technical result in this paper as follows. 

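Theorem 3.8. Let $\mathcal{F} = \bigwedge_{F \in \mathcal{U}} \bigwedge_{C \in F} C \land E = \mathcal{U} \land E$
be a CNF formula for which $(\mathcal{U}, \mathcal{V})_E$ is an $(s, \delta, \xi, E)$-expander with overlap
$\mathit{ol}(\mathcal{V}) = d$, and suppose furthermore that for all $\mathcal{U}' \subseteq \mathcal{U}$,
$|\mathcal{U}'| \le s$, it holds that $\mathcal{U}' \land E$ is satisfiable. Then any polynomial calculus
refutation of $\mathcal{F}$ requires degree strictly greater than $(\delta s - 2\xi)/(2d)$.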

In order to prove this theorem, it will be convenient to review some algebra. We do so next. 

3.2 Some Algebra Basics 

We will need to compute with polynomials modulo ideals, and in order to do so we need to have an 
ordering of monomials (which, as we recall, will always be multilinear). 
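As an aside, multilinear arithmetic is easy to experiment with. The following is a minimal sketch under our own assumptions: a polynomial is a dictionary mapping monomials (frozensets of variable names) to coefficients, and the function name `multiply` is hypothetical. Multilinearization, i.e., the rule $x^2 = x$, corresponds to taking the union of the variable sets.

```python
from collections import defaultdict
from fractions import Fraction

def multiply(p, q):
    """Multiply two multilinear polynomials and multilinearize the result.

    A polynomial is a dict {frozenset_of_variables: coefficient}.
    """
    result = defaultdict(Fraction)
    for m1, a in p.items():
        for m2, b in q.items():
            result[m1 | m2] += a * b  # set union implements x * x = x
    # Drop monomials whose coefficients cancelled out.
    return {m: c for m, c in result.items() if c != 0}

x = {frozenset({"x"}): 1}
one_minus_x = {frozenset(): 1, frozenset({"x"}): -1}
```

With these definitions, `multiply(x, x)` gives back `x`, and `multiply(x, one_minus_x)` is the zero polynomial (an empty dictionary), reflecting that $x(1 - x) = 0$ in the multilinear ring.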

Definition 3.9 (Admissible ordering). We say that a total ordering $\prec$ on the set of all monomials over
some fixed set of variables is admissible if the following conditions hold:

• If $\mathit{Deg}(m_1) < \mathit{Deg}(m_2)$, then $m_1 \prec m_2$.

• For any $m_1$, $m_2$, and $m$ such that $m_1 \prec m_2$ and
$\mathit{Vars}(m) \cap (\mathit{Vars}(m_1) \cup \mathit{Vars}(m_2)) = \emptyset$, it holds that
$m m_1 \prec m m_2$.

Two terms $t_1 = \alpha_1 m_1$ and $t_2 = \alpha_2 m_2$ are ordered in the same way as their underlying
monomials $m_1$ and $m_2$.

^Somewhat intriguingly, we will not see any disrespectful neighbours in our applications in Section 4, but the
concept of respectfulness is of crucial importance for the main technical result in Theorem 3.8 to go through.
One way of seeing this is to construct a $(\mathcal{U}, \mathcal{V})_E$-graph for an expanding set of linear
equations mod 2, where $\mathcal{U}$ consists of the (CNF encodings of) the equations, $\mathcal{V}$ consists of
one variable set for each equation containing exactly the variables in this equation, and $E$ is empty. Then
this $(\mathcal{U}, \mathcal{V})_E$-graph has the same boundary expansion as the constraint-variable incidence
graph, but Theorem 3.8 does not apply (which it should not do) since this expansion is not respectful.

One example of an admissible ordering is to first order monomials with respect to their degree and
then lexicographically. For the rest of this section we only need that $\prec$ is some fixed but arbitrary
admissible ordering, but the reader can think of the degree-lexicographical ordering without any particular
loss of generality. We write $m_1 \preceq m_2$ to denote that $m_1 \prec m_2$ or $m_1 = m_2$.
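The degree-lexicographical ordering can be sketched as a sort key. This is only an illustration under our own assumptions: monomials are frozensets of variable names, the lexicographic tie-break compares the sorted variable tuples (one possible convention; the text does not fix one), and the helper names are hypothetical. The brute-force checker tests the two conditions of Definition 3.9 on all multilinear monomials over a small variable set.

```python
from itertools import combinations

def deglex_key(monomial):
    """Sort key: first by degree, then lexicographically on the sorted variables."""
    return (len(monomial), tuple(sorted(monomial)))

def all_monomials(variables):
    """All multilinear monomials over the given variables, as frozensets."""
    vs = sorted(variables)
    return [frozenset(c) for k in range(len(vs) + 1) for c in combinations(vs, k)]

def is_admissible(variables, key):
    """Brute-force check of both admissibility conditions on a small variable set."""
    monos = all_monomials(variables)
    for m1 in monos:
        for m2 in monos:
            # Condition 1: lower degree implies smaller in the ordering.
            if len(m1) < len(m2) and not key(m1) < key(m2):
                return False
            # Condition 2: multiplying by a variable-disjoint monomial preserves order.
            if key(m1) < key(m2):
                for m in monos:
                    if m.isdisjoint(m1 | m2) and not key(m | m1) < key(m | m2):
                        return False
    return True
```

A purely lexicographic key such as `lambda m: tuple(sorted(m))` fails the first condition, which is why the degree comparison has to come first.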

Definition 3.10 (Leading, reducible, and irreducible terms). For a polynomial $P = \sum_i t_i$, the leading
term $\mathit{LT}(P)$ of $P$ is the largest term $t_i$ according to $\prec$. Let $I$ be an ideal over the
(multilinear) polynomial ring $\mathbb{F}[x, y, z, \ldots]/(x^2 - x, y^2 - y, z^2 - z, \ldots)$. We say that a
term $t$ is reducible modulo $I$ if there exists a polynomial $Q \in I$ such that $t = \mathit{LT}(Q)$ and that
$t$ is irreducible modulo $I$ otherwise.

The following fact is not hard to verify. 

Fact 3.11. Let $I$ be an ideal over $\mathbb{F}[x, y, z, \ldots]/(x^2 - x, y^2 - y, z^2 - z, \ldots)$. Then any
multilinear polynomial $P \in \mathbb{F}[x, y, z, \ldots]/(x^2 - x, y^2 - y, z^2 - z, \ldots)$ can be written
uniquely as a sum $Q + R$, where $Q \in I$ and $R$ is a linear combination of irreducible terms modulo $I$.

This is what allows us to reduce polynomials modulo an ideal in a well-defined manner. 

Definition 3.12 (Reduction operator). Let $P \in \mathbb{F}[x, y, z, \ldots]/(x^2 - x, y^2 - y, z^2 - z, \ldots)$
be any multilinear polynomial and let $I$ be an ideal over
$\mathbb{F}[x, y, z, \ldots]/(x^2 - x, y^2 - y, z^2 - z, \ldots)$. The reduction operator $R_I$ is the operator
that when applied to $P$ returns the sum of irreducible terms $R_I(P) = R$ such that $P - R \in I$.
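As a small illustration of the reduction operator (our own toy example, not taken from the paper), consider two variables and the ideal generated by the single monomial $xy$. Because products are multilinearized, this ideal is just the set of scalar multiples of $xy$, so the decomposition of Fact 3.11 can be read off directly.

```latex
% Toy example (an assumption for illustration): I = <xy> in F[x,y]/(x^2 - x, y^2 - y).
% Multiplying xy by any multilinear polynomial and multilinearizing gives
%   xy (a + bx + cy + d xy) = (a + b + c + d) xy,
% so I consists exactly of the scalar multiples of xy.
\[
  I = \{\, \alpha\, xy : \alpha \in \mathbb{F} \,\}, \qquad
  \text{reducible terms: nonzero multiples of } xy, \qquad
  \text{irreducible monomials: } 1,\ x,\ y .
\]
\[
  P = 3xy + x + 1 = \underbrace{3xy}_{Q \,\in\, I} + \underbrace{x + 1}_{R},
  \qquad\text{so}\qquad R_I(P) = x + 1 .
\]
```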

We conclude our brief algebra review by stating two observations that are more or less immediate,
but are helpful enough for us to want to highlight them explicitly.

Observation 3.13. For any two ideals $I_1$, $I_2$ such that $I_1 \subseteq I_2$ and any two polynomials $P$, $P'$
it holds that $R_{I_2}(P \cdot R_{I_1}(P')) = R_{I_2}(PP')$.

Proof. Let

$$P' = Q' + R' \tag{3.1}$$

for $Q' \in I_1$ and $R'$ a linear combination of irreducible terms over $I_1$. Let

$$P \cdot R_{I_1}(P') = PR' = Q + R \tag{3.2}$$

for $Q \in I_2$ and $R$ a linear combination of irreducible terms over $I_2$. Then

$$PP' = PQ' + PR' = PQ' + Q + R \tag{3.3}$$

where $PQ' + Q \in I_2$. By the uniqueness in Fact 3.11, we conclude that the equality
$R_{I_2}(PP') = R = R_{I_2}(P \cdot R_{I_1}(P'))$ holds. □


Observation 3.14. Suppose that the term $t$ is irreducible modulo the ideal $I$ and let $\rho$ be any partial
assignment of variables in $\mathit{Vars}(t)$ to values in $\mathbb{F}$ such that $t{\restriction_\rho} \neq 0$.
Then $t{\restriction_\rho}$ is also irreducible modulo $I$.

Proof. Let $m_\rho$ be the product of all variables in $t$ assigned by $\rho$ and let
$\alpha = m_\rho{\restriction_\rho}$, where by assumption we have $\alpha \neq 0$. If there is a polynomial
$Q \in I$ such that $\mathit{LT}(Q) = t{\restriction_\rho}$, then $\alpha^{-1} m_\rho Q \in I$ and
$\mathit{LT}(\alpha^{-1} m_\rho Q) = t$, contradicting that $t$ is irreducible. □
3.3 Proof Strategy

Let us now state the lemma on which we base the proof of Theorem 3.8. 

Lemma 3.15 ([Raz98]). Let $\mathcal{F}$ be any CNF formula and $D \in \mathbb{N}^+$ be a positive integer.
Suppose that there exists a linear operator $R$ on multilinear polynomials over $\mathit{Vars}(\mathcal{F})$
with the following properties:

1. $R(1) \neq 0$.

2. $R(C) = 0$ for (the translations to polynomials of) all axioms $C \in \mathcal{F}$.

3. For every term $t$ with $\mathit{Deg}(t) < D$ and every variable $x$ it holds that $R(xt) = R(xR(t))$.

Then any polynomial calculus refutation of $\mathcal{F}$ (and hence any PCR refutation of $\mathcal{F}$)
requires degree strictly greater than $D$.

The proof of Lemma 3.15 is not hard. The basic idea is that R will map all axioms to 0 by property 2, 
and further derivation steps in degree at most D will yield polynomials that also map to 0 by property 3 
and the linearity of R. But then property 1 implies that no derivation in degree at most D can reach 
contradiction. 

To prove Theorem 3.8, we construct a linear operator $R_{\mathcal{G}}$ that satisfies the conditions of
Lemma 3.15 when the $(\mathcal{U}, \mathcal{V})_E$-graph $\mathcal{G}$ is an expander. First, let us describe how
we make the connection between polynomials and the given $(\mathcal{U}, \mathcal{V})_E$-graph. We remark that in
the rest of this section we will identify a clause $C$ with its polynomial translation and will refer to $C$ as
a (polynomial) axiom.

Definition 3.16 (Term and polynomial neighbourhood). The neighbourhood $N(t)$ of a term $t$ with
respect to $(\mathcal{U}, \mathcal{V})_E$ is $N(t) = \{V \in \mathcal{V} \mid \mathit{Vars}(t) \cap V \neq \emptyset\}$,
i.e., the family of all variable sets containing variables mentioned by $t$. The neighbourhood of a polynomial
$P = \sum_i t_i$ is $N(P) = \bigcup_i N(t_i)$, i.e., the union of the neighbourhoods of all terms in $P$.
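The neighbourhood of a term only depends on which variables the term mentions, so it can be sketched in a few lines. The representation (a term as a set of variable names, the family as a list of sets) and the function names are our own illustrative assumptions.

```python
def term_neighbourhood(term_vars, family):
    """All variable sets in the family that share at least one variable with the term."""
    term_vars = set(term_vars)
    return {frozenset(V) for V in family if term_vars & set(V)}

def polynomial_neighbourhood(terms, family):
    """Union of the neighbourhoods of all terms of a polynomial."""
    result = set()
    for term_vars in terms:
        result |= term_neighbourhood(term_vars, family)
    return result

# Hypothetical family of variable sets.
example_family = [{"x", "y"}, {"y", "z"}, {"w"}]
```

Note that a constant term mentions no variables and hence has an empty neighbourhood, and that a term of degree $k$ has at most $k \cdot \mathit{ol}(\mathcal{V})$ neighbours.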

To every polynomial we can now assign a family of variable sets $\mathcal{V}'$. But we are interested in the
axioms that are needed in order to produce that polynomial. That is, given a family of variable sets
$\mathcal{V}'$, we would like to identify the largest set of axioms $\mathcal{U}'$ that could possibly have been
used in a derivation that yielded polynomials $P$ with $\mathit{Vars}(P) \subseteq \bigcup_{V \in \mathcal{V}'} V$.
This is the intuition behind the next definition.^

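Definition 3.17 (Polynomial support). For a given $(\mathcal{U}, \mathcal{V})_E$-graph and a family of variable
sets $\mathcal{V}' \subseteq \mathcal{V}$, we say that a subset $\mathcal{U}' \subseteq \mathcal{U}$ is
$(s, \mathcal{V}')$-contained if $|\mathcal{U}'| \le s$ and $\partial_E(\mathcal{U}') \subseteq \mathcal{V}'$.

We define the polynomial $s$-support $\mathit{Sup}_s(\mathcal{V}')$ of $\mathcal{V}'$ with respect to
$(\mathcal{U}, \mathcal{V})_E$, or just $s$-support of $\mathcal{V}'$ for brevity, to be the union of all
$(s, \mathcal{V}')$-contained subsets $\mathcal{U}' \subseteq \mathcal{U}$, and the $s$-support
$\mathit{Sup}_s(t)$ of a term $t$ is defined to be the $s$-support of $N(t)$.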

We will usually just speak about “support” below without further qualifying this term, since the
$(\mathcal{U}, \mathcal{V})_E$-graph $\mathcal{G}$ will be clear from context. The next observation follows
immediately from Definition 3.17.

Observation 3.18. Support is monotone in the sense that if $t \subseteq t'$ are two terms, then it holds that
$\mathit{Sup}_s(t) \subseteq \mathit{Sup}_s(t')$.

Once we have identified the axioms that are potentially involved in deriving $P$, we define the linear
operator $R_{\mathcal{G}}$ as the reduction modulo the ideal generated by these axioms as in Definition 3.12. We
will show that under the assumptions in Theorem 3.8 it holds that this operator satisfies the conditions in
Lemma 3.15. Let us first introduce some notation for the set of all polynomials that can be generated
from some axioms $\mathcal{U}' \subseteq \mathcal{U}$.

^We remark that Definition 3.17 is a slight modification of the original definition of support in [AR03] that
was proposed by Yuval Filmus [Fil14].
Definition 3.19. For a $(\mathcal{U}, \mathcal{V})_E$-graph and $\mathcal{U}' \subseteq \mathcal{U}$, we write
$\mathcal{I}_E(\mathcal{U}')$ to denote the ideal generated by the polynomial axioms in $\mathcal{U}' \land E$.^

Definition 3.20 ($(\mathcal{U}, \mathcal{V})_E$-graph reduction). For a $(\mathcal{U}, \mathcal{V})_E$-graph
$\mathcal{G}$, the $(\mathcal{U}, \mathcal{V})_E$-graph reduction $R_{\mathcal{G}}$ on a term $t$ is defined as
$R_{\mathcal{G}}(t) = R_{\mathcal{I}_E(\mathit{Sup}_s(t))}(t)$. For a polynomial $P$, we define
$R_{\mathcal{G}}(P)$ to be the linear extension of the operator $R_{\mathcal{G}}$ defined on terms.

Looking at Definition 3.20, it is not clear that we are making progress. On the one hand, we have
defined $R_{\mathcal{G}}$ in terms of standard reduction operators modulo ideals, which is nice since there is a
well-developed machinery for such operators. On the other hand, it is not clear how to actually compute
using $R_{\mathcal{G}}$. The problem is that if we look at a polynomial $P = \sum_i t_i$ and want to compute
$R_{\mathcal{G}}(P)$, then as we expand $R_{\mathcal{G}}(P) = \sum_i R_{\mathcal{G}}(t_i)$ we end up reducing
terms in one and the same polynomial modulo a priori completely different ideals. How can we get any sense of
what $P$ reduces to in such a case? The answer is that if our $(\mathcal{U}, \mathcal{V})_E$-graph is a good
enough expander, then this is not an issue at all. Instead, it turns out that we can pick a suitably large
ideal containing the support of all the terms in $P$ and reduce $P$ modulo this larger ideal instead without
changing anything. This key result is proven in Lemma 3.25 below. To establish this lemma, we need to develop a
better understanding of polynomial support.

3.4 Some Properties of Polynomial Support 

A crucial technical property that we will need is that if a $(\mathcal{U}, \mathcal{V})_E$-graph is a good
expander in the sense of Definition 3.6, then for small enough sets $\mathcal{V}'$ all
$(s, \mathcal{V}')$-contained subsets $\mathcal{U}' \subseteq \mathcal{U}$ per Definition 3.17 are of at most
half of the allowed size.

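Lemma 3.21. Let $(\mathcal{U}, \mathcal{V})_E$ be an $(s, \delta, \xi, E)$-expander and let
$\mathcal{V}' \subseteq \mathcal{V}$ be such that $|\mathcal{V}'| \le \delta s/2 - \xi$. Then it holds that every
$(s, \mathcal{V}')$-contained subset $\mathcal{U}' \subseteq \mathcal{U}$ is in fact
$(s/2, \mathcal{V}')$-contained.

Proof. As $|\mathcal{U}'| \le s$ we can appeal to the expansion property of the $(\mathcal{U}, \mathcal{V})_E$-graph
to derive the inequality $|\partial_E(\mathcal{U}')| \ge \delta|\mathcal{U}'| - \xi$. In the other direction, we
can obtain an upper bound on the size of $\partial_E(\mathcal{U}')$ by noting that for any
$(s, \mathcal{V}')$-contained set $\mathcal{U}'$ it holds that $|\partial_E(\mathcal{U}')| \le |\mathcal{V}'|$.
If we combine these bounds and use the assumption that $|\mathcal{V}'| \le \delta s/2 - \xi$, we can conclude
that $|\mathcal{U}'| \le s/2$, which proves that $\mathcal{U}'$ is $(s/2, \mathcal{V}')$-contained. □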
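Spelled out, the combination of bounds in this proof is the following chain of inequalities. We assume here that $\delta > 0$, which the division in the last step requires, and that the inequalities in Definition 3.6 and Lemma 3.21 are non-strict.

```latex
\[
  \delta |\mathcal{U}'| - \xi
  \;\le\; |\partial_E(\mathcal{U}')|
  \;\le\; |\mathcal{V}'|
  \;\le\; \frac{\delta s}{2} - \xi
  \quad\Longrightarrow\quad
  \delta |\mathcal{U}'| \le \frac{\delta s}{2}
  \quad\Longrightarrow\quad
  |\mathcal{U}'| \le \frac{s}{2} .
\]
```

Note that the slack $\xi$ cancels, which is why it is subtracted in the assumed bound on $|\mathcal{V}'|$.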

Even more importantly, Lemma 3.21 now allows us to conclude that for a small enough subset $\mathcal{V}'$ on
the right-hand side of $(\mathcal{U}, \mathcal{V})_E$ it holds that in fact the whole polynomial $s$-support
$\mathit{Sup}_s(\mathcal{V}')$ of $\mathcal{V}'$ on the left-hand side is $(s/2, \mathcal{V}')$-contained.

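Lemma 3.22. Let $(\mathcal{U}, \mathcal{V})_E$ be an $(s, \delta, \xi, E)$-expander and let
$\mathcal{V}' \subseteq \mathcal{V}$ be such that $|\mathcal{V}'| \le \delta s/2 - \xi$. Then the $s$-support
$\mathit{Sup}_s(\mathcal{V}')$ of $\mathcal{V}'$ with respect to $(\mathcal{U}, \mathcal{V})_E$ is
$(s/2, \mathcal{V}')$-contained.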

Proof. We show that for any pair of $(s, \mathcal{V}')$-contained sets
$\mathcal{U}_1, \mathcal{U}_2 \subseteq \mathcal{U}$ their union $\mathcal{U}_1 \cup \mathcal{U}_2$ is also
$(s, \mathcal{V}')$-contained. First, by Lemma 3.21 we have $|\mathcal{U}_1|, |\mathcal{U}_2| \le s/2$ and hence
$|\mathcal{U}_1 \cup \mathcal{U}_2| \le s$. Second, it holds that
$\partial_E(\mathcal{U}_1), \partial_E(\mathcal{U}_2) \subseteq \mathcal{V}'$, which implies that
$\partial_E(\mathcal{U}_1 \cup \mathcal{U}_2) \subseteq \mathcal{V}'$, because taking the union of two sets can
only shrink the boundary. This establishes that $\mathcal{U}_1 \cup \mathcal{U}_2$ is
$(s, \mathcal{V}')$-contained.

By induction on the number of $(s, \mathcal{V}')$-contained sets we can conclude that the support
$\mathit{Sup}_s(\mathcal{V}')$ is $(s, \mathcal{V}')$-contained as well, after which one final application of
Lemma 3.21 shows that this set is $(s/2, \mathcal{V}')$-contained. This completes the proof. □

What the next lemma says is, roughly, that if we reduce a term $t$ modulo an ideal generated by a not
too large set of polynomials containing some polynomials outside of the support of $t$, then we can remove
all such polynomials from the generators of the ideal without changing the irreducible component of $t$.

^That is, $\mathcal{I}_E(\mathcal{U}')$ is the smallest set $I$ of multilinear polynomials that contains all
axioms in $\mathcal{U}' \land E$ and that is closed under addition of $P_1, P_2 \in I$ and by multiplication of
$P \in I$ by any multilinear polynomial over $\mathit{Vars}(\mathcal{U} \land E)$ (where as before the
resulting product is implicitly multilinearized).
Lemma 3.23. Let $\mathcal{G}$ be a $(\mathcal{U}, \mathcal{V})_E$-graph and let $t$ be any term. Suppose that
$\mathcal{U}' \subseteq \mathcal{U}$ is such that $\mathcal{U}' \supseteq \mathit{Sup}_s(t)$ and
$|\mathcal{U}'| \le s$. Then for any term $t'$ with $N(t') \subseteq N(\mathit{Sup}_s(t)) \cup N(t)$ it holds
that if $t'$ is reducible modulo $\mathcal{I}_E(\mathcal{U}')$, it is also reducible modulo
$\mathcal{I}_E(\mathit{Sup}_s(t))$.

Proof. If $\mathcal{U}'$ is $(s, N(t))$-contained, then by Definition 3.17 it holds that
$\mathcal{U}' \subseteq \mathit{Sup}_s(t)$ and there is nothing to prove. Hence, assume $\mathcal{U}'$ is not
$(s, N(t))$-contained. We claim that this implies that we can find a subformula
$F \in \mathcal{U}' \setminus \mathit{Sup}_s(t)$ with a neighbouring subset of variables
$V \in (\partial_E(\mathcal{U}') \cap N(F)) \setminus N(t')$ in the respectful boundary of $\mathcal{U}'$ but
not in the neighbourhood of $t'$. To argue this, note that since $|\mathcal{U}'| \le s$ it follows from
Definition 3.17 that the reason $\mathcal{U}'$ is not $(s, N(t))$-contained is that there exist some
$F \in \mathcal{U}'$ and some set of variables $V \in N(F)$ such that
$V \in \partial_E(\mathcal{U}') \setminus N(t)$. Moreover, the assumption
$\mathcal{U}' \supseteq \mathit{Sup}_s(t)$ implies that such an $F$ cannot be in $\mathit{Sup}_s(t)$. Otherwise
there would exist an $(s, N(t))$-contained set $\mathcal{U}^*$ such that
$F \in \mathcal{U}^* \subseteq \mathit{Sup}_s(t) \subseteq \mathcal{U}'$, from which it would follow that
$V \in \partial_E(\mathcal{U}') \cap N(\mathcal{U}^*) \subseteq \partial_E(\mathcal{U}^*) \subseteq N(t)$,
contradicting $V \notin N(t)$. We have shown that $F \notin \mathit{Sup}_s(t) \subseteq \mathcal{U}'$ and
$V \in \partial_E(\mathcal{U}') \cap N(F)$, and by combining these two facts we can also deduce that
$V \notin N(\mathit{Sup}_s(t))$, since otherwise $V$ could not be contained in the boundary of $\mathcal{U}'$.
In particular, this means that $V \notin N(t') \subseteq N(\mathit{Sup}_s(t)) \cup N(t)$, which establishes the
claim made above.

Fixing $F$ and $V$ such that $F \in \mathcal{U}' \setminus \mathit{Sup}_s(t)$ and
$V \in (\partial_E(\mathcal{U}') \cap N(F)) \setminus N(t')$, our second claim is that if $F$ is removed from
the generators of the ideal, it still holds that if $t'$ is reducible modulo $\mathcal{I}_E(\mathcal{U}')$,
then this term is also reducible modulo $\mathcal{I}_E(\mathcal{U}' \setminus \{F\})$. Given this second claim
we are done, since we can then argue by induction over the elements in
$\mathcal{U}' \setminus \mathit{Sup}_s(t)$ and remove them one by one to arrive at the conclusion that every
term $t'$ with $N(t') \subseteq N(\mathit{Sup}_s(t)) \cup N(t)$ that is reducible modulo
$\mathcal{I}_E(\mathcal{U}')$ is also reducible modulo $\mathcal{I}_E(\mathit{Sup}_s(t))$, which is precisely
what the lemma says.

We proceed to establish this second claim. The assumption that $t'$ is reducible modulo
$\mathcal{I}_E(\mathcal{U}')$ means that there exists a polynomial $P \in \mathcal{I}_E(\mathcal{U}')$ such that
$t' = \mathit{LT}(P)$. Since $P$ is in the ideal $\mathcal{I}_E(\mathcal{U}')$ it can be written as a polynomial
combination $P = \sum_i P_i C_i$ of axioms $C_i \in \mathcal{U}' \land E$ for some polynomials $P_i$. If we
could hit $P$ with a restriction that satisfies (and hence removes) $F$ while leaving $t'$ and
$(\mathcal{U}' \setminus \{F\}) \land E$ untouched, this would show that $t'$ is the leading term of some
polynomial combination of axioms in $(\mathcal{U}' \setminus \{F\}) \land E$. This is almost what we are going
to do.

As our restriction $\rho$ we choose any assignment with domain $\mathit{Dom}(\rho) = V$ that $E$-respectfully
satisfies $F$. Note that at least one such assignment exists since
$V \in \partial_E(\mathcal{U}') \cap N(F)$ is an $E$-respectful neighbour of $F$ by Definition 3.5. By the
choice of $\rho$ it holds that $F$ is satisfied, i.e., that all axioms in $F$ are set to 0. Furthermore, none
of the axioms in $\mathcal{U}' \setminus \{F\}$ are affected by $\rho$ since $V$ is in the boundary of
$\mathcal{U}'$.^ As for axioms in $E$ it is not necessarily true that $\rho$ will leave all of them untouched,
but by assumption $\rho$ respects $E$ and so any axiom in $E$ is either satisfied (and zeroed out) by $\rho$ or
is left intact. It follows that $P{\restriction_\rho}$ can be written as a polynomial combination
$P{\restriction_\rho} = \sum_i (P_i{\restriction_\rho}) C_i$, where
$C_i \in (\mathcal{U}' \setminus \{F\}) \land E$, and hence
$P{\restriction_\rho} \in \mathcal{I}_E(\mathcal{U}' \setminus \{F\})$.

To see that $t'$ is preserved as the leading term of $P{\restriction_\rho}$, note that $\rho$ does not assign
any variables in $t'$ since $V \notin N(t')$. Hence, $t' = \mathit{LT}(P{\restriction_\rho})$, as $\rho$ can
only make the other terms smaller with respect to $\prec$. This shows that there is a polynomial
$P' = P{\restriction_\rho} \in \mathcal{I}_E(\mathcal{U}' \setminus \{F\})$ with $\mathit{LT}(P') = t'$, and
hence $t'$ is reducible modulo $\mathcal{I}_E(\mathcal{U}' \setminus \{F\})$. The lemma follows. □

We need to deal with one more detail before we can prove the key technical lemma that it is possible 
to reduce modulo suitably chosen larger ideals without changing the reduction operator, namely (again 
roughly speaking) that reducing a term modulo an ideal does not introduce any new variables outside of 
the generators of that ideal. 

Lemma 3.24. Suppose that $\mathcal{U}^* \subseteq \mathcal{U}$ for some $(\mathcal{U}, \mathcal{V})_E$-graph and
let $t$ be any term. Then it holds that
$N(R_{\mathcal{I}_E(\mathcal{U}^*)}(t)) \subseteq N(\mathcal{U}^*) \cup N(t)$.

Proof. Let $P = R_{\mathcal{I}_E(\mathcal{U}^*)}(t)$ be the polynomial obtained when reducing $t$ modulo
$\mathcal{I}_E(\mathcal{U}^*)$ and let $V \in \mathcal{V}$ be any set such that
$V \notin N(\mathcal{U}^*) \cup N(t)$. We show that $V \notin N(P)$.

By the definition of $(\mathcal{U}, \mathcal{V})_E$-graphs there exists an assignment $\rho$ to all of the
variables in $V$ that respects $E$. Write $t = Q + P$ with $Q \in \mathcal{I}_E(\mathcal{U}^*)$ and $P$ a linear
combination of irreducible monomials as in Fact 3.11 and apply the restriction $\rho$ to this equality. Note
that $t{\restriction_\rho} = t$ as $V$ is not a neighbour of $t$. Moreover, $Q{\restriction_\rho}$ is in the
ideal $\mathcal{I}_E(\mathcal{U}^*)$ because $\rho$ does not set any variables in $\mathcal{U}^*$ and every
axiom in $E$ sharing variables with $V$ is set to 0 by $\rho$. Thus, $t$ can be written as
$t = Q' + P{\restriction_\rho}$, with $Q' \in \mathcal{I}_E(\mathcal{U}^*)$. As all terms in $P$ are
irreducible modulo $\mathcal{I}_E(\mathcal{U}^*)$, they remain irreducible after restricting $P$ by $\rho$ by
Observation 3.14. Hence, it follows that $P{\restriction_\rho} = P$ by the uniqueness in Fact 3.11 and $P$
cannot contain any variable from $V$. This in turn implies that every set $V \in N(P)$ is contained in
$N(\mathcal{U}^*) \cup N(t)$. □

^Recalling the remark after Definition 3.4, we note that we can ignore here if $\rho$ happens to falsify axioms
in $\mathcal{U} \setminus \mathcal{U}'$.

Now we can state the formal claim that enlarging the ideal does not change the reduction operator if 
the enlargement is done in the right way. 

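Lemma 3.25. Let $\mathcal{G}$ be a $(\mathcal{U}, \mathcal{V})_E$-graph and let $t$ be any term. Suppose that
$\mathcal{U}' \subseteq \mathcal{U}$ is such that $\mathcal{U}' \supseteq \mathit{Sup}_s(t)$ and
$|\mathcal{U}'| \le s$. Then it holds that
$R_{\mathcal{I}_E(\mathcal{U}')}(t) = R_{\mathcal{I}_E(\mathit{Sup}_s(t))}(t)$.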

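Proof. We prove that $R_{\mathcal{I}_E(\mathcal{U}')}(t) = R_{\mathcal{I}_E(\mathit{Sup}_s(t))}(t)$ by applying
the contrapositive of Lemma 3.23. Recall that this lemma states that any term $t'$ with
$N(t') \subseteq N(\mathit{Sup}_s(t)) \cup N(t)$ that is reducible modulo $\mathcal{I}_E(\mathcal{U}')$ is also
reducible modulo $\mathcal{I}_E(\mathit{Sup}_s(t))$. Since every term $t'$ in
$R_{\mathcal{I}_E(\mathit{Sup}_s(t))}(t)$ is irreducible modulo $\mathcal{I}_E(\mathit{Sup}_s(t))$ and since by
applying Lemma 3.24 with $\mathcal{U}^* = \mathit{Sup}_s(t)$ we have that
$N(t') \subseteq N(\mathit{Sup}_s(t)) \cup N(t)$, it follows that $t'$ is also irreducible modulo
$\mathcal{I}_E(\mathcal{U}')$. This shows that
$R_{\mathcal{I}_E(\mathcal{U}')}(t) = R_{\mathcal{I}_E(\mathit{Sup}_s(t))}(t)$ as claimed, and the lemma
follows. □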

3.5 Putting the Pieces in the Proof Together 

Now we have just a couple of lemmas left before we can prove Theorem 3.8, which as discussed above 
will be established by appealing to Lemma 3.15. 

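Lemma 3.26. Let $(\mathcal{U}, \mathcal{V})_E$ be an $(s, \delta, \xi, E)$-expander with overlap
$\mathit{ol}(\mathcal{V}) = d$. Then for any term $t$ with $\mathit{Deg}(t) \le (\delta s - 2\xi)/(2d)$ it holds
that $|\mathit{Sup}_s(t)| \le s/2$.

Proof. Because of the bound on the overlap of $\mathcal{V}$, every variable of $t$ belongs to at most $d$ sets
in $\mathcal{V}$, and so the size of $N(t)$ is bounded by $d \cdot \mathit{Deg}(t) \le \delta s/2 - \xi$. An
application of Lemma 3.22 now yields the desired bound $|\mathit{Sup}_s(t)| \le s/2$. □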

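Lemma 3.27. Let $(\mathcal{U}, \mathcal{V})_E$ be an $(s, \delta, \xi, E)$-expander with overlap
$\mathit{ol}(\mathcal{V}) = d$. Then for any term $t$ with
$\mathit{Deg}(t) < \lfloor (\delta s - 2\xi)/(2d) \rfloor$, any term $t'$ occurring in
$R_{\mathcal{I}_E(\mathit{Sup}_s(t))}(t)$, and any variable $x$, it holds that
$R_{\mathcal{I}_E(\mathit{Sup}_s(xt'))}(xt') = R_{\mathcal{I}_E(\mathit{Sup}_s(xt))}(xt')$.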

Proof. We prove the lemma by showing that $\mathit{Sup}_s(xt') \subseteq \mathit{Sup}_s(xt)$ and that
$|\mathit{Sup}_s(xt)| \le s$, which then allows us to apply Lemma 3.25. To prove that $\mathit{Sup}_s(xt')$ is
a subset of $\mathit{Sup}_s(xt)$, we will show that $\mathit{Sup}_s(xt') \cup \mathit{Sup}_s(xt)$ is
$(s, N(xt))$-contained in the sense of Definition 3.17. From this it follows that
$\mathit{Sup}_s(xt') \subseteq \mathit{Sup}_s(xt') \cup \mathit{Sup}_s(xt) = \mathit{Sup}_s(xt)$.

Towards this goal, as $\mathit{Deg}(t') \le \mathit{Deg}(t)$ we first observe that we can apply Lemma 3.26 to
deduce that $|\mathit{Sup}_s(xt')| \le s/2$ and $|\mathit{Sup}_s(xt)| \le s/2$, and hence
$|\mathit{Sup}_s(xt') \cup \mathit{Sup}_s(xt)| \le s$, which satisfies the size condition for containment. It
remains to show that $\partial_E(\mathit{Sup}_s(xt') \cup \mathit{Sup}_s(xt)) \subseteq N(xt)$. From
Lemma 3.24 we have that $N(t') \subseteq N(\mathit{Sup}_s(t)) \cup N(t)$. As $N(xt') = N(x) \cup N(t')$ and
$\mathit{Sup}_s(t) \subseteq \mathit{Sup}_s(xt)$ by the monotonicity in Observation 3.18, it follows that

$$N(xt') = N(x) \cup N(t') \subseteq N(x) \cup N(\mathit{Sup}_s(t)) \cup N(t) \subseteq N(\mathit{Sup}_s(xt)) \cup N(xt) . \tag{3.4}$$

If we now consider the $E$-respectful boundary of the set $\mathit{Sup}_s(xt') \cup \mathit{Sup}_s(xt)$, it
holds that

$$
\begin{aligned}
\partial_E\bigl(\mathit{Sup}_s(xt') \cup \mathit{Sup}_s(xt)\bigr)
  &= \bigl(\partial_E(\mathit{Sup}_s(xt')) \setminus N(\mathit{Sup}_s(xt))\bigr) \cup \bigl(\partial_E(\mathit{Sup}_s(xt)) \setminus N(\mathit{Sup}_s(xt'))\bigr) \\
  &\subseteq \bigl(N(xt') \setminus N(\mathit{Sup}_s(xt))\bigr) \cup \bigl(N(xt) \setminus N(\mathit{Sup}_s(xt'))\bigr) \\
  &\subseteq N(xt) ,
\end{aligned}
$$

where the first line follows from the boundary definition in Definition 3.5, the second line follows by
the property of $s$-support that $\partial_E(\mathit{Sup}_s(xt)) \subseteq N(xt)$, and the last line follows
from (3.4). Hence, $\mathit{Sup}_s(xt') \cup \mathit{Sup}_s(xt)$ is $(s, N(xt))$-contained.

As discussed above, we can now apply Lemma 3.25 to reach the desired conclusion that the equality
$R_{\mathcal{I}_E(\mathit{Sup}_s(xt'))}(xt') = R_{\mathcal{I}_E(\mathit{Sup}_s(xt))}(xt')$ holds. □


Now we can prove our main technical theorem. 

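Proof of Theorem 3.8. Recall that the assumptions of the theorem are that we have a
$(\mathcal{U}, \mathcal{V})_E$-graph for a CNF formula $\mathcal{F} = \bigwedge_{F \in \mathcal{U}} F \land E$
such that $(\mathcal{U}, \mathcal{V})_E$ is an $(s, \delta, \xi, E)$-expander with overlap
$\mathit{ol}(\mathcal{V}) = d$ and that furthermore for all $\mathcal{U}' \subseteq \mathcal{U}$,
$|\mathcal{U}'| \le s$, it holds that $\bigwedge_{F \in \mathcal{U}'} F \land E$ is satisfiable. We want to
prove that no polynomial calculus derivation from
$\bigwedge_{F \in \mathcal{U}} F \land E = \mathcal{U} \land E$ of degree at most $(\delta s - 2\xi)/(2d)$
can reach contradiction.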

First, if removing all axiom clauses from $\mathcal{U} \land E$ with degree strictly greater than
$(\delta s - 2\xi)/(2d)$ produces a satisfiable formula, then the lower bound trivially holds. Otherwise, we
can remove these large-degree axioms and still be left with a $(\mathcal{U}, \mathcal{V})_E$-graph that
satisfies the conditions above. In order to see this, let us analyze what happens to the
$(\mathcal{U}, \mathcal{V})_E$-graph if an axiom is removed from the formula.

Removing axioms from $E$ only relaxes the conditions on respectful satisfiability while keeping all
edges in the graph, so the conditions of the theorem still hold. In removing axioms from $\mathcal{U}$ we have
two cases: either we remove all axioms from some subformula $F \in \mathcal{U}$ or we remove only a part of
this subformula. In the former case, it is clear that we can remove the vertex $F$ from the structure without
affecting any of the conditions. In the latter case, we claim that any set $V \in \mathcal{V}$ that is an
$E$-respectful neighbour of $F$ remains an $E$-respectful neighbour of the formula $F'$ in which large-degree
axioms have been removed. Clearly, the same assignments to $V$ that satisfy $F$ also satisfy
$F' \subseteq F$. Also, $V$ must still be a neighbour of $F'$, for otherwise $F'$ would not share any
variables with $V$, which would imply that no assignment to $V$ could satisfy $F'$ and hence $F$. This would
contradict the assumption that $V$ is an $E$-respectful neighbour of $F$. Hence, we conclude that removal of
large-degree axioms can only improve the $E$-respectful boundary expansion of the
$(\mathcal{U}, \mathcal{V})_E$-graph.

Thus, let us focus on a $(\mathcal{U}, \mathcal{V})_E$-graph $\mathcal{G}$ that has all axioms of degree at most
$(\delta s - 2\xi)/(2d)$. We want to show that the operator $R_{\mathcal{G}}$ from Definition 3.20 satisfies
the conditions of Lemma 3.15, from which Theorem 3.8 immediately follows. We can note right away that the
operator $R_{\mathcal{G}}$ is linear by construction.

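To prove that $R_{\mathcal{G}}(1) = R_{\mathcal{I}_E(\mathit{Sup}_s(1))}(1) \neq 0$, we start by observing that
the size of the $s$-support of 1 is upper-bounded by $s/2$ according to Lemma 3.26. Using the assumption that
for every subset $\mathcal{U}'$ of $\mathcal{U}$, $|\mathcal{U}'| \le s$, the formula $\mathcal{U}' \land E$ is
satisfiable, it follows that 1 is not in the ideal $\mathcal{I}_E(\mathit{Sup}_s(1))$ and hence
$R_{\mathcal{I}_E(\mathit{Sup}_s(1))}(1) \neq 0$.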

We next show that $R_{\mathcal{G}}(C) = 0$ for any axiom clause $C \in \mathcal{U} \land E$ (where we recall
that we identify a clause $C$ with its translation into a linear combination of monomials). By the
preprocessing step above it holds that the degree of $C$ is bounded by $(\delta s - 2\xi)/(2d)$, from which it
follows by Lemma 3.26 that the size of the $s$-support of every term in $C$ is bounded by $s/2$. Since $C$ is
the polynomial encoding of a clause, the leading term $\mathit{LT}(C)$ contains all the variables appearing
in $C$.^ Hence, the $s$-support $\mathit{Sup}_s(\mathit{LT}(C))$ of the leading term contains the $s$-support
of every other term in $C$ by Observation 3.18 and we can use Lemma 3.25 to conclude that
$R_{\mathcal{G}}(C) = R_{\mathcal{I}_E(\mathit{Sup}_s(\mathit{LT}(C)))}(C)$. If $C \in E$, this means we are
done because $\mathcal{I}_E(\mathit{Sup}_s(\mathit{LT}(C)))$ contains all of $E$, implying that
$R_{\mathcal{G}}(C) = 0$.

For $C \in \mathcal{U}$ we cannot immediately argue that $C$ reduces to 0, since (in contrast to [AR03]) it is
not immediately clear that $\mathit{Sup}_s(\mathit{LT}(C))$ contains $C$. The problem here is that we might
worry that $C$ is part of some subformula $F \in \mathcal{U}$ for which the boundary $\partial_E(F)$ is not
contained in $N(\mathit{LT}(C)) = \mathit{Vars}(C)$, and hence there is no obvious reason why $C$ should be a
member of any $(s, N(\mathit{LT}(C)))$-contained subset of $\mathcal{U}$. However, in view of Lemma 3.25
(applied, strictly speaking, once for every term in $C$) we can choose some $F \in \mathcal{U}$ such that
$C \in F$ and add it to the $s$-support $\mathit{Sup}_s(\mathit{LT}(C))$ to obtain a set
$\mathcal{U}' = \mathit{Sup}_s(\mathit{LT}(C)) \cup \{F\}$ of size $|\mathcal{U}'| \le s/2 + 1 \le s$ such that
$R_{\mathcal{I}_E(\mathit{Sup}_s(\mathit{LT}(C)))}(C) = R_{\mathcal{I}_E(\mathcal{U}')}(C)$. Since
$\mathcal{I}_E(\mathcal{U}')$ contains $C$ as a generator we conclude that
$R_{\mathcal{G}}(C) = R_{\mathcal{I}_E(\mathcal{U}')}(C) = 0$ for $C \in \mathcal{U}$.^

^We remark that this is the only place in the proof where we are using that $C$ is (the encoding of) a clause.

It remains to prove the last property in Lemma 3.15 stating that
$R_{\mathcal{G}}(xt) = R_{\mathcal{G}}(x R_{\mathcal{G}}(t))$ for any term $t$ such that
$\mathit{Deg}(t) < \lfloor (\delta s - 2\xi)/(2d) \rfloor$. We can see that this holds by studying the
following sequence of equalities:


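$$
\begin{aligned}
R_{\mathcal{G}}(x R_{\mathcal{G}}(t))
  &= \sum_{t' \in R_{\mathcal{G}}(t)} R_{\mathcal{G}}(xt') && \text{[by linearity]} \\
  &= \sum_{t' \in R_{\mathcal{G}}(t)} R_{\mathcal{I}_E(\mathit{Sup}_s(xt'))}(xt') && \text{[by definition of } R_{\mathcal{G}}\text{]} \\
  &= \sum_{t' \in R_{\mathcal{G}}(t)} R_{\mathcal{I}_E(\mathit{Sup}_s(xt))}(xt') && \text{[by Lemma 3.27]} \\
  &= R_{\mathcal{I}_E(\mathit{Sup}_s(xt))}(x R_{\mathcal{G}}(t)) && \text{[by linearity]} \\
  &= R_{\mathcal{I}_E(\mathit{Sup}_s(xt))}\bigl(x R_{\mathcal{I}_E(\mathit{Sup}_s(t))}(t)\bigr) && \text{[by definition of } R_{\mathcal{G}}\text{]} \\
  &= R_{\mathcal{I}_E(\mathit{Sup}_s(xt))}(xt) && \text{[by Observation 3.13]} \\
  &= R_{\mathcal{G}}(xt) && \text{[by definition of } R_{\mathcal{G}}\text{]}
\end{aligned}
$$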


Thus, $R_{\mathcal{G}}$ satisfies all the properties of Lemma 3.15, from which the theorem follows. □


Let us next show that if the slack $\xi$ in Theorem 3.8 is zero, then the condition that
$\mathcal{U}' \land E$ is satisfiable for sufficiently small $\mathcal{U}'$ is already implied by the expansion.

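Lemma 3.28. If a $(\mathcal{U}, \mathcal{V})_E$-graph is an $(s, \delta, 0, E)$-expander and
$\mathit{Vars}(\mathcal{U} \land E) = \bigcup_{V \in \mathcal{V}} V$, then for any
$\mathcal{U}' \subseteq \mathcal{U}$, $|\mathcal{U}'| \le s$, the formula $\mathcal{U}' \land E$ is satisfiable.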

Proof. Let $\mathcal{U}' \subseteq \mathcal{U}$ be any subset of size at most $s$. First, we show that we can
find a subset $\mathcal{V}' \subseteq N(\mathcal{U}')$ and an assignment $\rho$ to the set of variables
$\bigcup_{V \in \mathcal{V}'} V$ such that $\rho$ $E$-respectfully satisfies $\mathcal{U}'$. We do this by
induction on the number of formulas in $\mathcal{U}'$. As the $(\mathcal{U}, \mathcal{V})_E$-graph is an
$(s, \delta, 0, E)$-expander it follows that $|\partial_E(\mathcal{U}')| \ge \delta|\mathcal{U}'| > 0$ for any
non-empty subset $\mathcal{U}'$ and hence there exists a formula $F \in \mathcal{U}'$ and a variable set $V'$
such that $V'$ is an $E$-respectful neighbour of $F$ and is not a neighbour of any formula in
$\mathcal{U}' \setminus \{F\}$. Therefore, there is an assignment $\rho$ to the variables in $V'$ that
$E$-respectfully satisfies $F$. By the induction hypothesis there also exists an assignment $\rho'$ that
$E$-respectfully satisfies $\mathcal{U}' \setminus \{F\}$ and does not assign any variables in $V'$ as
$V' \notin N(\mathcal{U}' \setminus \{F\})$. Hence, by extending the assignment $\rho'$ to the variables in
$V'$ according to the assignment $\rho$, we create an assignment to the union of variables in some subset of
$N(\mathcal{U}')$ that $E$-respectfully satisfies $\mathcal{U}'$.

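We now need to show how to extend this to an assignment satisfying also $E$. To this end, let
$\rho_{\mathcal{V}'}$ be an assignment that $E$-respectfully satisfies $\mathcal{U}'$ and assigns the variables
in $\bigcup_{V \in \mathcal{V}'} V$ for some $\mathcal{V}' \subseteq N(\mathcal{U}')$. By another induction over
the size $|\mathcal{V}'' \setminus \mathcal{V}'|$ of families $\mathcal{V}'' \supseteq \mathcal{V}'$, we show
that there is an assignment $\rho_{\mathcal{V}''}$ to the variables $\bigcup_{V \in \mathcal{V}''} V$ that
$E$-respectfully satisfies $\mathcal{U}'$ for every
$\mathcal{V}' \subseteq \mathcal{V}'' \subseteq \mathcal{V}$. When $\mathcal{V}'' = \mathcal{V}'$, we just take
the assignment $\rho_{\mathcal{V}'}$. We want to show that for any
$V' \in \mathcal{V} \setminus \mathcal{V}''$ we can extend $\rho_{\mathcal{V}''}$ to the variables in $V'$ so
that the new assignment $E$-respectfully satisfies $\mathcal{U}'$. As $V'$ respects $E$, there is an assignment
$\rho_{V'}$ to the variables $V'$ that satisfies all affected clauses in $E$. We would like to combine
$\rho_{V'}$ and $\rho_{\mathcal{V}''}$ into one assignment, but this requires some care since the intersection
of the domains $V' \cap \bigl(\bigcup_{V \in \mathcal{V}''} V\bigr)$ could be non-empty. Consider therefore the
subassignment $\rho^*_{V'}$ of $\rho_{V'}$ that assigns only the variables in
$V' \setminus \bigl(\bigcup_{V \in \mathcal{V}''} V\bigr)$. We claim that extending $\rho_{\mathcal{V}''}$ by
$\rho^*_{V'}$ creates an assignment that respects $E$. This is because every clause in $E$ that has a variable
in $V'$ and was not already satisfied by $\rho_{\mathcal{V}''}$ cannot have variables in
$V' \cap \bigl(\bigcup_{V \in \mathcal{V}''} V\bigr)$ (as otherwise $\rho_{\mathcal{V}''}$ would have been
$E$-disrespectful) and hence every such clause must be satisfied by the subassignment $\rho^*_{V'}$.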

^Actually, a slightly more careful argument reveals that $C$ is always contained in
$\mathit{Sup}_s(\mathit{LT}(C))$. This is so since for any $F \in \mathcal{U}$ with $C \in F$ it holds that any
neighbours in $N(F) \setminus N(\mathit{LT}(C))$ have to be disrespectful, and so such an $F$ always makes it
into the support. However, the reasoning gets a bit more involved, and since we already needed to use
Lemma 3.25 anyway we might as well apply it once more here.
Thus, we can find an assignment to all the variables $\bigcup_{V \in \mathcal{V}} V$ that $E$-respectfully
satisfies $\mathcal{U}'$. As $\mathcal{V}$ includes all the variables in $E$ it means that $E$ is also fully
satisfied. Hence, $\mathcal{U}' \land E$ is satisfiable and the lemma follows. □

This allows us to conclude this section by stating the following version of Theorem 3.8 for the most
commonly occurring case with standard expansion without any slack.

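Corollary 3.29. Suppose that $(\mathcal{U}, \mathcal{V})_E$ is an $(s, \delta, 0, E)$-expander with overlap
$\mathit{ol}(\mathcal{V}) = d$ such that $\mathit{Vars}(\mathcal{U} \land E) = \bigcup_{V \in \mathcal{V}} V$.
Then any polynomial calculus refutation of the formula $\bigwedge_{F \in \mathcal{U}} F \land E$ requires
degree strictly greater than $\delta s/(2d)$.

Proof. This follows immediately by plugging Lemma 3.28 into Theorem 3.8. □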

4 Applications 

In this section, we demonstrate how to use the machinery developed in Section 3 to establish degree 
lower bounds for polynomial calculus. Let us warm up by reproving the bound from [AR03] for CNF 
formulas F whose clause-variable incidence graphs G{F) are good enough expanders. We first recall 
the expansion concept used in [AR03] for ordinary bipartite graphs. 

Definition 4.1 (Bipartite boundary expander). A bipartite graph $G = (U \cup V, E)$ is a bipartite
$(s, \delta)$-boundary expander if for every set of vertices $U' \subseteq U$, $|U'| \le s$, it holds that
$|\partial(U')| \ge \delta|U'|$, where the boundary $\partial(U') = \{v \in V : |N(v) \cap U'| = 1\}$ consists
of all vertices on the right-hand side $V$ that have a unique neighbour in $U'$ on the left-hand side.
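For small graphs the boundary and the expansion condition can be checked by brute force. The sketch below is only illustrative: the adjacency-dictionary representation and function names are our own assumptions, the expansion inequality is taken to be non-strict, and the running time is exponential in $s$.

```python
from collections import Counter
from itertools import combinations

def boundary(adj, subset):
    """Right-hand vertices with exactly one neighbour in the given left-hand subset.

    adj maps each left-hand vertex to the set of its right-hand neighbours.
    """
    counts = Counter(v for u in subset for v in adj[u])
    return {v for v, c in counts.items() if c == 1}

def is_boundary_expander(adj, s, delta):
    """Check |boundary(U')| >= delta * |U'| for all non-empty U' with |U'| <= s."""
    left = sorted(adj)
    for k in range(1, min(s, len(left)) + 1):
        for subset in combinations(left, k):
            if len(boundary(adj, subset)) < delta * k:
                return False
    return True

# Hypothetical clause-variable incidence graph of three two-variable clauses.
example_adj = {"C1": {"x", "y"}, "C2": {"y", "z"}, "C3": {"z", "w"}}
```

In `example_adj`, every set of at most two clauses has at least as many unique neighbours as clauses, but the set of all three clauses has only the two boundary variables `x` and `w`.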

We can simply identify the $(\mathcal{U}, \mathcal{V})_E$-graph with the standard clause-variable incidence
graph $G(\mathcal{F})$ to recover the degree lower bound in [AR03] as stated next.

Theorem 4.2 ([AR03]). For any CNF formula $\mathcal{F}$ and any constant $\delta > 0$ it holds that if the
clause-variable incidence graph $G(\mathcal{F})$ is an $(s, \delta)$-boundary expander, then the degree
required to refute $\mathcal{F}$ in polynomial calculus is $\mathit{Deg}(\mathcal{F} \vdash \bot) > \delta s/2$.

Proof. To choose $G(\mathcal{F})$ as our $(\mathcal{U}, \mathcal{V})_E$-graph, we set $E$ to be the empty
formula, $\mathcal{U}$ to be the set of clauses of $\mathcal{F}$ interpreted as one-clause CNF formulas, and
$\mathcal{V}$ to be the set of variables partitioned into singleton sets. As $E$ is an empty formula every set
$V \in \mathcal{V}$ respects it. Also, every neighbour of some clause $C \in \mathcal{U}$ is an $E$-respectful
neighbour because we can set the neighbouring variable so that the clause $C \in \mathcal{U}$ is satisfied.
Under this interpretation $G(\mathcal{F})$ is an $(s, \delta, 0, E)$-expander with overlap
$\mathit{ol}(\mathcal{V}) = 1$, and hence by Corollary 3.29 the degree of refuting $\mathcal{F}$ is greater
than $\delta s/2$. □

As a second application, which is more interesting in the sense that the $(\mathcal{U}, \mathcal{V})_E$-graph is
nontrivial, we show how the degree lower bound for the ordering principle formulas in [GL10] can be
established using this framework. For an undirected (and in general non-bipartite) graph $G$, the graph
ordering principle formula $\mathit{GOP}(G)$ says that there exists a totally ordered set of $|V(G)|$ elements
where no element is minimal, since every element/vertex $v$ has a neighbour $u \in N(v)$ which is smaller
according to the ordering. Formally, the CNF formula $\mathit{GOP}(G)$ is defined over variables $x_{u,v}$,
$u, v \in V(G)$, $u \neq v$, where the intended meaning of the variables is that $x_{u,v}$ is true if $u < v$
according to the ordering, and consists of the following axiom clauses:



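$$
\begin{aligned}
&\overline{x}_{u,v} \lor \overline{x}_{v,w} \lor x_{u,w} && u, v, w \in V(G),\ u \neq v \neq w \neq u && \text{(transitivity)} && \text{(4.1a)} \\
&\overline{x}_{u,v} \lor \overline{x}_{v,u} && u, v \in V(G),\ u \neq v && \text{(anti-symmetry)} && \text{(4.1b)} \\
&x_{u,v} \lor x_{v,u} && u, v \in V(G),\ u \neq v && \text{(totality)} && \text{(4.1c)} \\
&\bigvee_{u \in N(v)} x_{u,v} && v \in V(G) && \text{(non-minimality)} && \text{(4.1d)}
\end{aligned}
$$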
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We remark that the graph ordering principle on the complete graph Kn on n vertices is the (linear) 
ordering principle formula LOPn (also known as a least number principle formula, or graph tautology in 
the literature), for which the non-minimality axioms (4. Id) have width linear in n. By instead considering 
graph ordering formulas for graphs G of bounded degree, one can bring the initial width of the formulas 
down so that the question of degree lower bounds becomes meaningful. 
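
To make the definition of $GOP(G)$ concrete, here is a small sketch that writes out the clauses (4.1a)-(4.1d) for a given graph and checks unsatisfiability by brute force. The literal encoding `((u, v), polarity)` and the function names are choices made for this illustration only.

```python
from itertools import combinations, permutations, product

def gop_clauses(adj):
    """Clauses of GOP(G); a literal is ((u, v), polarity), with variable (u, v) meaning u < v."""
    verts = sorted(adj)
    clauses = []
    for u, v, w in permutations(verts, 3):                      # transitivity (4.1a)
        clauses.append([((u, v), False), ((v, w), False), ((u, w), True)])
    for u, v in combinations(verts, 2):
        clauses.append([((u, v), False), ((v, u), False)])      # anti-symmetry (4.1b)
        clauses.append([((u, v), True), ((v, u), True)])        # totality (4.1c)
    for v in verts:                                             # non-minimality (4.1d)
        clauses.append([((u, v), True) for u in sorted(adj[v])])
    return clauses

def satisfiable(clauses):
    """Exhaustive search over all assignments; only feasible for tiny formulas."""
    variables = sorted({var for clause in clauses for var, _ in clause})
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[var] == pol for var, pol in clause) for clause in clauses):
            return True
    return False

# The triangle K_3: six variables, so 64 assignments to try.
k3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
clauses = gop_clauses(k3)
print(len(clauses), satisfiable(clauses))
```

The width of the non-minimality clauses equals the vertex degree, which is why bounded-degree graphs keep the initial width small.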

To prove degree lower bounds for $GOP(G)$ we need the following extension of boundary expansion
to the case of non-bipartite graphs.

Definition 4.3 (Non-bipartite boundary expander). A graph $G = (V,E)$ is an $(s,\delta)$-boundary expander
if for every subset of vertices $V' \subseteq V(G)$, $|V'| \le s$, it holds that $|\partial(V')| \ge \delta|V'|$,
where the boundary $\partial(V') = \{v \in V(G) \setminus V' : |N(v) \cap V'| = 1\}$ is the set of all vertices
in $V(G) \setminus V'$ that have a unique neighbour in $V'$.
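
The only change compared to the bipartite case is that boundary vertices must lie outside $V'$. A brute-force sketch follows, with function names and the 6-cycle example chosen purely for illustration.

```python
from itertools import combinations

def boundary(adj, subset):
    """Vertices outside `subset` that have exactly one neighbour inside it."""
    inside = set(subset)
    return {v for v in adj if v not in inside and len(adj[v] & inside) == 1}

def expansion_holds(adj, s, delta):
    """Check |boundary(V')| >= delta * |V'| for every nonempty V' with |V'| <= s."""
    return all(len(boundary(adj, sub)) >= delta * k
               for k in range(1, s + 1)
               for sub in combinations(adj, k))

# The cycle on 6 vertices as a toy example.
cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
print(boundary(cycle6, {0}))          # both neighbours of vertex 0
print(boundary(cycle6, {0, 2, 4}))    # every outside vertex has two neighbours inside
```

The second call shows why cycles are poor boundary expanders for larger sets: an independent set covering every other vertex has empty boundary.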

We want to point out that the definition of expansion used by Galesi and Lauria in [GL10] is
slightly weaker in that they do not require boundary expansion but just vertex expansion (measured
as $|N(V') \setminus V'|$ for vertex sets $V'$ with $|V'| \le s$), and hence their result is slightly stronger
than what we state below in Theorem 4.4. With some modifications of the definition of $E$-respectful boundary
in $(\mathcal{U},\mathcal{V})_E$-graphs it would be possible to match the lower bound in [GL10], but it would
also make the definitions more cumbersome and so we choose not to do so here.

Theorem 4.4 ([GL10]). For a non-bipartite graph $G$ that is an $(s,\delta)$-boundary expander it holds that
$\mathrm{Deg}(GOP(G) \vdash \bot) > \delta s/4$.

Proof. To form the $(\mathcal{U},\mathcal{V})_E$-graph for $GOP(G)$, we let $E$ consist of all transitivity
axioms (4.1a), anti-symmetry axioms (4.1b), and totality axioms (4.1c). The non-minimality axioms (4.1d)
viewed as singleton sets form the family $\mathcal{U}$, while $\mathcal{V}$ is the family of variable sets $V_v$
for each vertex $v$ containing all variables that mention $v$, i.e.,
$V_v = \{x_{u,w} \mid u,w \in V(G),\ u = v \text{ or } w = v\}$.

For a vertex $u$, the neighbours of a non-minimality axiom $F_u = \bigvee_{v \in N(u)} x_{v,u} \in \mathcal{U}$
are variable sets $V_v$ where $v$ is either equal to $u$ or a neighbour of $u$ in $G$. We can prove that each
$V_v \in N(F_u)$ is an $E$-respectful neighbour of $F_u$ (although the particular neighbour $V_u$ will not
contribute in the proof of the lower bound). If $v \ne u$, then setting all the variables $x_{v,w} \in V_v$ to
true and all the variables $x_{w,v} \in V_v$ to false (i.e., making $v$ into the minimal element of the set)
satisfies $F_u$ as well as all the affected axioms in $E$. If $v = u$, we can use a complementary assignment
to the one above (i.e., making $v = u$ into the maximal element of the set) to $E$-respectfully satisfy $F_u$.
Observe that this also shows that all $V_v \in \mathcal{V}$ respect $E$ as required by Definition 3.4.

By the analysis above, it holds that the boundary $\partial(V')$ of some vertex set $V'$ in $G$ yields the
$E$-respectful boundary $\partial_E(\{F_u \mid u \in V'\}) \supseteq \{V_v \mid v \in \partial(V')\}$ in
$(\mathcal{U},\mathcal{V})_E$. Thus, the expansion parameters for $(\mathcal{U},\mathcal{V})_E$ are the same as
those for $G$ and we can conclude that $(\mathcal{U},\mathcal{V})_E$ is an $(s,\delta,0,E)$-expander.

Finally, we note that while $\mathcal{V}$ is not a partition of the variables of $GOP(G)$, the overlap is only
$ol(\mathcal{V}) = 2$ since every variable $x_{u,v}$ occurs in exactly two sets $V_u$ and $V_v$ in $\mathcal{V}$.
Hence, by Corollary 3.29 the degree of refuting $GOP(G)$ is greater than $\delta s/4$. □
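
The two facts used in the proof, that the assignment making $v$ minimal respects $E$ and that the overlap of the variable sets is 2, can be checked mechanically on a small instance. The sketch below assumes that an assignment respects $E$ if it satisfies every clause of $E$ that mentions a variable in its domain (our reading of Definition 3.4, which is stated outside this section); all function names are illustrative.

```python
from itertools import combinations, permutations

def order_axioms(verts):
    """Transitivity, anti-symmetry and totality clauses; a literal is ((u, v), polarity)."""
    E = [[((u, v), False), ((v, w), False), ((u, w), True)]
         for u, v, w in permutations(verts, 3)]
    for u, v in combinations(verts, 2):
        E.append([((u, v), False), ((v, u), False)])
        E.append([((u, v), True), ((v, u), True)])
    return E

def variable_set(verts, v):
    """V_v: all variables x_{a,b} that mention v."""
    return {(a, b) for a, b in permutations(verts, 2) if v in (a, b)}

def make_minimal(verts, v):
    """Assignment with x_{v,w} = True and x_{w,v} = False for all w != v."""
    rho = {}
    for w in verts:
        if w != v:
            rho[(v, w)] = True
            rho[(w, v)] = False
    return rho

def respects(rho, clauses):
    """Assumed notion: every clause touching dom(rho) must be satisfied by rho."""
    for clause in clauses:
        touched = any(var in rho for var, _ in clause)
        satisfied = any(rho.get(var) == pol for var, pol in clause)
        if touched and not satisfied:
            return False
    return True

def overlap_counts(verts):
    """For each variable, the number of sets V_v containing it."""
    counts = {}
    for v in verts:
        for var in variable_set(verts, v):
            counts[var] = counts.get(var, 0) + 1
    return counts

verts = list(range(5))
E = order_axioms(verts)
print(all(respects(make_minimal(verts, v), E) for v in verts))
print(set(overlap_counts(verts).values()))
```

Note that the check does not depend on the edges of $G$, since $E$ and the sets $V_v$ are defined over all pairs of vertices; the graph only enters through the non-minimality axioms.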

With the previous theorem in hand, we can prove (a version of) the main result in [GL10], namely
that there exists a family of 5-CNF formulas witnessing that the lower bound on size in terms of degree
in Theorem 2.2 is essentially optimal. That is, there are formulas over $N$ variables that can be refuted
in polynomial calculus (in fact, in resolution) in size polynomial in $N$ but require degree $\Omega(\sqrt{N})$.
This follows by plugging expanders with suitable parameters into Theorem 4.4. By standard calculations (see,
for example, [HLW06]) one can show that there exist constants $\gamma, \delta > 0$ such that randomly sampled
graphs on $n$ vertices with degree at most 5 are $(\gamma n, \delta)$-boundary expanders in the sense of
Definition 4.3 with high probability. By Theorem 4.4, graph ordering principle formulas on such graphs yield
5-CNF formulas over $\Theta(n^2)$ variables that require degree $\Omega(n)$. Since these formulas have
polynomial calculus refutations in size $O(n^3)$ (just mimicking the resolution refutations constructed in
[Sta96]), this shows that the bound in Theorem 2.2 is essentially tight. The difference between this bound and
[GL10] is that since a weaker form of expansion is required in [GL10] it is possible to use 3-regular graphs,
yielding families of 3-CNF formulas.
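
The parameter bookkeeping behind this paragraph can be spelled out as follows. The first two lines only use the definition of $GOP(G)$ and Theorem 4.4; the last line assumes that Theorem 2.2 has the usual size-degree form (size at least exponential in $(D - W)^2/N$ for degree $D$, initial width $W$ and $N$ variables), which is stated outside this section.

```latex
% N = number of variables of GOP(G), n = |V(G)|, s = \gamma n.
\begin{align*}
N &= n(n-1) = \Theta(n^2), \\
\mathrm{Deg}(GOP(G) \vdash \bot) &> \frac{\delta s}{4} = \frac{\delta\gamma}{4}\, n
   = \Omega(n) = \Omega\bigl(\sqrt{N}\bigr), \\
\frac{(D - W)^2}{N} &= O(1) \quad \text{for } D = \Theta\bigl(\sqrt{N}\bigr),\ W = O(1).
\end{align*}
```

So a degree lower bound of order $\sqrt{N}$ yields no nontrivial size bound through this relation, which is consistent with the formulas having polynomial-size refutations.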

Let us now turn our attention back to bipartite graphs and consider different flavours of pigeonhole
principle formulas. We will focus on formulas over bounded-degree bipartite graphs, where we will
convert standard bipartite boundary expansion as in Definition 4.1 into respectful boundary expansion as
in Definition 3.6. For a bipartite graph $G = (U \mathbin{\dot{\cup}} V, E)$ the axioms appearing in the
different versions of the graph pigeonhole principle formulas are as follows:


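$\bigvee_{v \in N(u)} x_{u,v}$,   $u \in U$   (pigeon axioms)   (4.2a)
$\overline{x}_{u,v} \lor \overline{x}_{u',v}$,   $v \in V$, $u, u' \in N(v)$, $u \ne u'$   (hole axioms)   (4.2b)
$\overline{x}_{u,v} \lor \overline{x}_{u,v'}$,   $u \in U$, $v, v' \in N(u)$, $v \ne v'$   (functionality axioms)   (4.2c)
$\bigvee_{u \in N(v)} x_{u,v}$,   $v \in V$   (onto axioms)   (4.2d)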

The “plain vanilla” graph pigeonhole principle formula $PHP_G$ is the CNF formula over variables
$\{x_{u,v} \mid (u,v) \in E\}$ consisting of clauses (4.2a) and (4.2b); the graph functional pigeonhole principle
formula $FPHP_G$ contains the clauses of $PHP_G$ and in addition clauses (4.2c); the graph onto pigeonhole
principle formula $\textit{Onto-PHP}_G$ contains $PHP_G$ plus clauses (4.2d); and the graph onto functional
pigeonhole principle formula $\textit{Onto-FPHP}_G$ consists of all the clauses (4.2a)-(4.2d).

We obtain the standard versions of the PHP formulas by considering graph formulas as above over the
complete bipartite graph $K_{n+1,n}$. In the opposite direction, for any bipartite graph $G$ with $n + 1$
vertices on the left and $n$ vertices on the right we can hit any version of the pigeonhole principle formula
over $K_{n+1,n}$ with the restriction $\rho_G$ setting $x_{u,v}$ to false for all $(u,v) \notin E(G)$ to recover
the corresponding graph pigeonhole principle formula over $G$. When doing so, we will use the observation from
Section 2 that restricting a formula can only decrease the size and degree required to refute it.
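
The four formula variants and the restriction $\rho_G$ can be illustrated on a tiny instance. The sketch below generates the clauses (4.2a)-(4.2d) and checks that restricting the formula over $K_{3,2}$ yields the graph formula; the function names, the literal encoding and the particular subgraph are hypothetical choices for this example.

```python
from itertools import combinations

def php_clauses(pigeons, holes, edges, functional=False, onto=False):
    """Clauses (4.2a)-(4.2d) over G; a literal is ((u, v), polarity)."""
    edges = set(edges)
    N_p = {u: [v for v in holes if (u, v) in edges] for u in pigeons}
    N_h = {v: [u for u in pigeons if (u, v) in edges] for v in holes}
    clauses = [[((u, v), True) for v in N_p[u]] for u in pigeons]     # pigeon axioms
    for v in holes:                                                    # hole axioms
        clauses += [[((u, v), False), ((u2, v), False)]
                    for u, u2 in combinations(N_h[v], 2)]
    if functional:                                                     # functionality axioms
        for u in pigeons:
            clauses += [[((u, v), False), ((u, v2), False)]
                        for v, v2 in combinations(N_p[u], 2)]
    if onto:                                                           # onto axioms
        clauses += [[((u, v), True) for u in N_h[v]] for v in holes]
    return clauses

def restrict(clauses, rho):
    """Apply a partial assignment: drop satisfied clauses, delete falsified literals."""
    out = []
    for clause in clauses:
        if any(rho.get(var) == pol for var, pol in clause):
            continue
        out.append([(var, pol) for var, pol in clause if var not in rho])
    return out

def as_set(clauses):
    return {frozenset(clause) for clause in clauses}

pigeons, holes = [0, 1, 2], [0, 1]
complete = [(u, v) for u in pigeons for v in holes]        # K_{3,2}
sparse = [(0, 0), (1, 0), (1, 1), (2, 1)]                  # a hypothetical subgraph G
rho_G = {e: False for e in complete if e not in sparse}    # x_{u,v} = false for non-edges

restricted = restrict(php_clauses(pigeons, holes, complete, functional=True), rho_G)
direct = php_clauses(pigeons, holes, sparse, functional=True)
print(as_set(restricted) == as_set(direct))
```

Hole and functionality axioms that mention a non-edge are satisfied by $\rho_G$ and disappear, while pigeon and onto axioms simply lose the literals of non-edges.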

As mentioned in Section 1, it was established already in [AR03] that good bipartite boundary expanders
$G$ yield formulas $PHP_G$ that require large polynomial calculus degree to refute. We can reprove
this result in our language (and, in fact, observe that the lower bound in [AR03] works also for the onto
version $\textit{Onto-PHP}_G$) by constructing an appropriate $(\mathcal{U},\mathcal{V})_E$-graph. In addition,
we can generalize the result in [AR03] slightly by allowing some additive slack $\xi > 0$ in the expansion in
Theorem 3.8. This works as long as we have the guarantee that no too small subformulas are unsatisfiable.

Theorem 4.5. Suppose that $G = (U \mathbin{\dot{\cup}} V, E)$ is a bipartite graph with $|U| = n$ and
$|V| = n - 1$ and that $\delta > 0$ is a constant such that

• for every set $U' \subseteq U$ of size $|U'| \le s$ there is a matching of $U'$ into $V$, and

• for every set $U' \subseteq U$ of size $|U'| \le s$ it holds that $|\partial(U')| \ge \delta|U'| - \xi$.

Then $\mathrm{Deg}(\textit{Onto-PHP}_G \vdash \bot) > \delta s/2 - \xi$.

Proof sketch. The $(\mathcal{U},\mathcal{V})_E$-graph for $PHP_G$ is formed by taking $\mathcal{U}$ to be the
set of pigeon axioms (4.2a), $E$ to consist of the hole axioms (4.2b) and onto axioms (4.2d), and $\mathcal{V}$
to be the collection of variable sets $V_v = \{x_{u,v} \mid u \in N(v)\}$ partitioned with respect to the holes
$v \in V$. It is straightforward to check that this $(\mathcal{U},\mathcal{V})_E$-graph is isomorphic to the
graph $G$ and that all neighbours in $(\mathcal{U},\mathcal{V})_E$ are $E$-respectful (for neighbours
$F_u = \bigvee_{v \in N(u)} x_{u,v}$ and $V_v$ for some $v \in N(u)$, apply the partial assignment sending pigeon
$u$ to hole $v$ and ruling out all other pigeons in $N(v) \setminus \{u\}$ for $v$). Moreover, using the
existence of matchings for all sets of pigeons $U'$ of size $|U'| \le s$ we can prove that every subformula
$U' \land E$ is satisfiable as long as $|U'| \le s$. Hence, we can apply Theorem 3.8 to derive the claimed
bound. We refer to the upcoming full-length version of [MN14] for the omitted details. □

Theorem 4.5 is the only place in this paper where we use non-zero slack for the expansion. The
reason that we need slack is so that we can establish lower bounds for another type of formulas, namely
the subset cardinality formulas studied in [Spe10, VS10, MN14]. A brief (and somewhat informal)
description of these formulas is as follows. We start with a 4-regular bipartite graph to which we add an
extra edge between two non-connected vertices. We then write down clauses stating that each degree-4
vertex on the left has at least 2 of its edges set to true, while the single degree-5 vertex has a strict majority
of 3 incident edges set to true. On the right-hand side of the graph we encode the opposite, namely that
all vertices with degree 4 have at least 2 of their edges set to false, while the vertex with degree 5 has at
least 3 edges set to false. A simple counting argument yields that the CNF formula consisting of these
clauses must be unsatisfiable. Formally, we have the following definition (which strictly speaking is a
slightly specialized case of the general construction, but again we refer to [MN14] for the details).

Definition 4.6 (Subset cardinality formulas [VS10, MN14]). Suppose that $G = (U \mathbin{\dot{\cup}} V, E)$ is a
bipartite graph that is 4-regular except that one extra edge has been added between two unconnected
vertices on the left and right. Then the subset cardinality formula $SC(G)$ over $G$ has variables $x_e$,
$e \in E$, and clauses:

• $x_{e_1} \lor x_{e_2} \lor x_{e_3}$ for every triple $e_1, e_2, e_3$ of edges incident to any $u \in U$,

• $\overline{x}_{e_1} \lor \overline{x}_{e_2} \lor \overline{x}_{e_3}$ for every triple $e_1, e_2, e_3$ of edges incident to any $v \in V$.
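
As an illustration of Definition 4.6 and of the counting argument behind unsatisfiability, the following sketch builds $SC(G)$ for a small circulant graph with one extra edge and compares the number of edges forced to true and to false with the number of edges available. The graph and the function names are made up for this example.

```python
from itertools import combinations

def sc_clauses(edges):
    """Subset cardinality clauses; a literal is (edge, polarity)."""
    left, right = {}, {}
    for e in edges:
        u, v = e
        left.setdefault(u, []).append(e)
        right.setdefault(v, []).append(e)
    clauses = []
    for incident in left.values():     # every triple at a left vertex contains a true edge
        clauses += [[(e, True) for e in triple] for triple in combinations(incident, 3)]
    for incident in right.values():    # every triple at a right vertex contains a false edge
        clauses += [[(e, False) for e in triple] for triple in combinations(incident, 3)]
    return clauses

def forced_counts(edges):
    """Lower bounds on the number of true and false edges implied by the clauses."""
    deg_left, deg_right = {}, {}
    for u, v in edges:
        deg_left[u] = deg_left.get(u, 0) + 1
        deg_right[v] = deg_right.get(v, 0) + 1
    # If every triple at a vertex of degree d >= 3 contains a true edge, then at most
    # 2 of its edges are false, so at least d - 2 are true (and symmetrically on the right).
    min_true = sum(d - 2 for d in deg_left.values())
    min_false = sum(d - 2 for d in deg_right.values())
    return min_true, min_false

n = 5
edges = [(i, (i + j) % n) for i in range(n) for j in range(4)]   # 4-regular circulant graph
edges.append((0, 4))                                             # extra edge between non-neighbours

min_true, min_false = forced_counts(edges)
print(len(sc_clauses(edges)), len(edges), min_true, min_false)
```

Since every edge is either true or false, the formula is unsatisfiable whenever `min_true + min_false` exceeds the number of edges, which is the case for any 4-regular graph with one added edge.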

To prove lower bounds on refutation degree for these formulas we use the standard notion of vertex
expansion on bipartite graphs, where all neighbours of a set of left vertices are counted and not just
unique neighbours as in Definition 4.1.

Definition 4.7 (Bipartite expander). A bipartite graph $G = (U \mathbin{\dot{\cup}} V, E)$ is a bipartite
$(s,\delta)$-expander if for each vertex set $U' \subseteq U$, $|U'| \le s$, it holds that $|N(U')| \ge \delta|U'|$.

The existence of such expanders with appropriate parameters can again be established by straightforward
calculations (as in, for instance, [HLW06]).

Theorem 4.8 ([MN14]). Suppose that $G = (U \mathbin{\dot{\cup}} V, E)$ is a 4-regular bipartite
$(\gamma n, \frac{5}{2} + \delta)$-expander for $|U| = |V| = n$ and some constants $\gamma, \delta > 0$, and let
$G'$ be obtained from $G$ by adding an arbitrary edge between two unconnected vertices in $U$ and $V$. Then
refuting the formula $SC(G')$ requires degree $\mathrm{Deg}(SC(G') \vdash \bot) = \Omega(n)$, and hence size
$S_{PCR}(SC(G') \vdash \bot) = \exp(\Omega(n))$.

Proof sketch. The proof is by reducing to graph PHP formulas and applying Theorem 4.5 (which of 
course also holds with onto axioms removed). We fix some complete matching in G, which is guaranteed 
to exist in regular bipartite graphs, and then set all edges in the matching as well as the extra added edge 
to true. Now the degree-5 vertex v* on the right has only 3 neighbours and the constraint for v* requires 
all of these edges to be set to false. Hence, we set these edges to false as well which makes v* and its 
clauses vanish from the formula. The restriction leaves us with n vertices on the left which require that 
at least 1 of the remaining 3 edges incident to them is true, while the $n - 1$ vertices on the right require
that at most 1 out of their incident edges is true. That is, we have restricted our subset cardinality formula 
to obtain a graph PHP formula. 

As the original graph is a $(\gamma n, \frac{5}{2} + \delta)$-expander, a simple calculation shows that the new
graph is a boundary expander where each set of vertices $U'$ on the left with size $|U'| \le \gamma n$ has
boundary expansion $|\partial(U')| \ge 2\delta|U'| - 1$. Note the additive slack of 1 compared to the usual
expansion condition, which is caused by the removal of the degree-5 vertex $v^*$ from the right. Now we can
appeal to Theorem 4.5 (and Theorem 2.2) to obtain the lower bounds claimed in the theorem. □
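
One way to carry out the simple calculation in the last step is sketched below. This is our own reconstruction under the stated hypotheses, not necessarily the argument in [MN14]. The symbols $G''$ (the graph left after the restriction), $P$ (the matching partners of $U'$), $e(U')$ (the number of edges of $G''$ leaving $U'$) and the indicator $b$ are introduced only for this sketch.

```latex
% b = 1 if v^* is a neighbour of U' in G but not a matching partner of U', and b = 0 otherwise.
\begin{align*}
|N_{G''}(U')| &\ge |N_G(U')| - |P| - b \ge \bigl(\tfrac{5}{2} + \delta\bigr)|U'| - |U'| - b
               = \bigl(\tfrac{3}{2} + \delta\bigr)|U'| - b, \\
e(U') &\le 3|U'| - b, \\
|\partial_{G''}(U')| &\ge 2\,|N_{G''}(U')| - e(U') \ge 2\delta|U'| - b \ge 2\delta|U'| - 1 .
\end{align*}
```

The first line holds because a neighbour of $U'$ in $G$ survives unless it is $v^*$ or is reached only through a matching edge. The second holds because every left vertex keeps at most 3 of its 4 edges, and if $b = 1$ then some non-matching edge from $U'$ to $v^*$ is removed as well. The third uses that every neighbour outside the boundary receives at least two edges from $U'$.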

Let us conclude this section by presenting our new lower bounds for the functional pigeonhole principle
formulas. As a first attempt, we could try to reason as in the proof of Theorem 4.5 (but adding
the axioms (4.2c) and removing axioms (4.2d)). The naive idea would be to modify our
$(\mathcal{U},\mathcal{V})_E$-graph slightly by substituting the functionality axioms for the onto axioms in $E$
while keeping $\mathcal{U}$ and $\mathcal{V}$ the same. This does not work, however: although the sets
$V_v \in \mathcal{V}$ are $E$-respectful, the only assignment that respects $E$ is the one that sets all
variables $x_{u,v} \in V_v$ to false. Thus, it is not possible to satisfy any of the pigeon axioms, meaning
that there are no $E$-respectful neighbours in $(\mathcal{U},\mathcal{V})_E$. In order to obtain a useful
$(\mathcal{U},\mathcal{V})_E$-graph, we instead need to redefine $\mathcal{V}$ by enlarging the variable sets
$V_v$, using the fact that $\mathcal{V}$ is not required to be a partition. Doing so in the appropriate way
yields the following theorem.

Theorem 4.9. Suppose that $G = (U \mathbin{\dot{\cup}} V, E)$ is a bipartite $(s,\delta)$-boundary expander
with left degree bounded by $d$. Then it holds that refuting $FPHP_G$ in polynomial calculus requires degree
strictly greater than $\delta s/(2d)$. It follows that if $G$ is a bipartite $(\gamma n, \delta)$-boundary
expander with constant left degree and $\gamma, \delta > 0$, then any polynomial calculus (PC or PCR)
refutation of $FPHP_G$ requires size $\exp(\Omega(n))$.

Proof. We construct a $(\mathcal{U},\mathcal{V})_E$-graph from $FPHP_G$ as follows. We let the set of clauses
$E$ consist of all hole axioms (4.2b) and functionality axioms (4.2c). We define the family $\mathcal{U}$ to
consist of the pigeon axioms (4.2a) interpreted as singleton CNF formulas. For the variables we let
$\mathcal{V} = \{V_v \mid v \in V\}$, where for every hole $v$ the set $V_v$ is defined by

$V_v = \{x_{u',v'} \mid u' \in N(v) \text{ and } v' \in N(u')\}$.   (4.3)

That is, to build $V_v$ we start with the hole $v$ on the right, consider all pigeons $u'$ on the left that can
go into this hole, and finally include in $V_v$ for all such $u'$ the variables $x_{u',v'}$ for all holes $v'$
incident to $u'$. We want to show that $(\mathcal{U},\mathcal{V})_E$ as defined above satisfies the conditions
in Corollary 3.29.

Note first that every variable set $V_v$ respects the clause set $E$ since setting all variables in $V_v$ to
false satisfies all clauses in $E$ mentioning variables in $V_v$. It is easy to see from (4.3) that when a hole
$v$ is a neighbour of a pigeon $u$, the variable set $V_v$ is also a neighbour in the
$(\mathcal{U},\mathcal{V})_E$-graph of the corresponding pigeon axiom $F_u = \bigvee_{v \in N(u)} x_{u,v}$. These
are the only neighbours of the pigeon axiom $F_u$, as each $V_v$ contains only variables mentioning pigeons in
the neighbourhood of $v$. In other words, $G$ and $(\mathcal{U},\mathcal{V})_E$ share the same neighbourhood
structure.

Moreover, we claim that every neighbour $V_v$ of $F_u$ is an $E$-respectful neighbour. To see this, consider
the assignment $\rho_{u,v}$ that sets $x_{u,v}$ to true and the remaining variables in $V_v$ to false. Clearly,
$F_u$ is satisfied by $\rho_{u,v}$. All axioms in $E$ not containing $x_{u,v}$ are either satisfied by
$\rho_{u,v}$ or left untouched, since $\rho_{u,v}$ assigns all other variables in its domain to false. Any hole
axiom $\overline{x}_{u,v} \lor \overline{x}_{u',v}$ in $E$ that does contain $x_{u,v}$ is satisfied by
$\rho_{u,v}$ since $x_{u',v} \in V_v$ for $u' \in N(v)$ by (4.3) and this variable is set to false by
$\rho_{u,v}$. In the same way, any functionality axiom $\overline{x}_{u,v} \lor \overline{x}_{u,v'}$ containing
$x_{u,v}$ is satisfied since the variable $x_{u,v'}$ is in $V_v$ by (4.3) and is hence assigned to false. Thus,
the assignment $\rho_{u,v}$ $E$-respectfully satisfies $F_u$, and so $F_u$ and $V_v$ are $E$-respectful
neighbours as claimed.

Since our constructed $(\mathcal{U},\mathcal{V})_E$-graph is isomorphic to the original graph $G$ and all
neighbour relations are respectful, the expansion parameters of $G$ trivially carry over to respectful
expansion in $(\mathcal{U},\mathcal{V})_E$. This is just another way of saying that
$(\mathcal{U},\mathcal{V})_E$ is an $(s,\delta,0,E)$-expander.

To finish the proof, note that the overlap of $\mathcal{V}$ is at most $d$. This is so since a variable
$x_{u,v}$ appears in a set $V_{v'}$ only when $v' \in N(u)$. Hence, for all variables $x_{u,v}$ it holds that
they appear in at most $|N(u)| \le d$ sets in $\mathcal{V}$. Now the conclusion that any polynomial calculus
refutation of $FPHP_G$ requires degree greater than $\delta s/(2d)$ can be read off from Corollary 3.29. In
addition, the exponential lower bound on the size of a refutation of $FPHP_G$ when $G$ is a
$(\gamma n, \delta)$-boundary expander with constant left degree follows by plugging the degree lower bound
into Theorem 2.2. □
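
The construction in this proof can be checked mechanically on a small graph. The sketch below builds the enlarged sets $V_v$ from (4.3), verifies that each assignment $\rho_{u,v}$ satisfies every clause of $E$ it touches, and computes the overlap. It also shows that the naive choice of variable sets (only the variables of hole $v$) fails on a functionality axiom. The notion of respecting $E$ is assumed to be the one from Definition 3.4, stated outside this section; the graph and all names are illustrative.

```python
from itertools import combinations

def fphp_setup(edges):
    """Return neighbourhoods, the clause set E (hole + functionality axioms), and the sets V_v."""
    N_p, N_h = {}, {}
    for u, v in edges:
        N_p.setdefault(u, set()).add(v)
        N_h.setdefault(v, set()).add(u)
    E = []
    for v, ps in N_h.items():                       # hole axioms (4.2b)
        E += [[((u, v), False), ((u2, v), False)] for u, u2 in combinations(sorted(ps), 2)]
    for u, hs in N_p.items():                       # functionality axioms (4.2c)
        E += [[((u, v), False), ((u, v2), False)] for v, v2 in combinations(sorted(hs), 2)]
    # Enlarged variable sets from (4.3): all variables of every pigeon that can fly to v.
    V = {v: {(u2, v2) for u2 in N_h[v] for v2 in N_p[u2]} for v in N_h}
    return N_p, N_h, E, V

def rho(variables, u, v):
    """x_{u,v} is set to true, every other variable in `variables` to false."""
    return {var: (var == (u, v)) for var in variables}

def respects(assignment, clauses):
    """Assumed notion: every clause touching the domain must be satisfied."""
    for clause in clauses:
        touched = any(var in assignment for var, _ in clause)
        satisfied = any(assignment.get(var) == pol for var, pol in clause)
        if touched and not satisfied:
            return False
    return True

# Hypothetical graph: 4 pigeons, 3 holes, left degree d = 2.
edges = [(i, i % 3) for i in range(4)] + [(i, (i + 1) % 3) for i in range(4)]
N_p, N_h, E, V = fphp_setup(edges)

ok = all(respects(rho(V[v], u, v), E) for u, v in edges)
overlap = max(sum(var in V[v] for v in V) for var in edges)
naive_ok = respects(rho({(u2, 0) for u2 in N_h[0]}, 0, 0), E)
print(ok, overlap, naive_ok)
```

The overlap equals the left degree here because a variable $x_{u,v}$ lands in $V_{v'}$ exactly when $v'$ is a neighbour of $u$.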

It is not hard to show (again we refer to [HLW06] for the details) that there exist bipartite graphs with
left degree 3 which are $(\gamma n, \delta)$-boundary expanders for $\gamma, \delta > 0$ and hence our size
lower bound for polynomial calculus refutations of $FPHP_G$ can be applied to them. Moreover, if
$|U| = n + 1$ and $|V| = n$, then we can identify some bipartite graph $G$ that is a good expander and hit
$FPHP^{n+1}_n = FPHP_{K_{n+1,n}}$ with a restriction $\rho_G$ setting $x_{u,v}$ to false for all
$(u,v) \notin E$ to obtain $FPHP^{n+1}_n{\restriction}_{\rho_G} = FPHP_G$. Since restrictions can only decrease
refutation size, it follows that size lower bounds for $FPHP_G$ apply also to $FPHP^{n+1}_n$, yielding the
second lower bound claimed in Section 1.3.

Theorem 4.10. Any polynomial calculus or polynomial calculus resolution refutation of (the standard
CNF encoding of) the functional pigeonhole principle $FPHP^{n+1}_n$ requires size $\exp(\Omega(n))$.

5 Concluding Remarks

In this work, we extend the techniques developed by Alekhnovich and Razborov [AR03] for proving 
degree lower bounds on refutations of CNF formulas in polynomial calculus. Instead of looking at the 
clause-variable incidence graph G(F) of the formula F as in [AR03], we allow clustering of clauses and 
variables and reason in terms of the incidence graph G' defined on these clusters. We show that the CNF 
formula F requires high degree to be refuted in polynomial calculus whenever this clustering can be 
done in a way that “respects the structure” of the formula and so that the resulting graph G' has certain 
expansion properties. 

This provides us with a unified framework within which we can reprove previously established degree
lower bounds in [AR03, GL10, MN14]. More importantly, this also allows us to obtain a degree
lower bound on the functional pigeonhole principle defined on expander graphs, solving an open problem
from [Raz02]. It immediately follows from this that (the standard CNF encodings of) the usual
functional pigeonhole principle formulas require exponential proof size in polynomial calculus resolution,
resolving a question on Razborov’s problems list [Raz15] which had (quite annoyingly) remained
open. This means that we now have an essentially complete understanding of how the different variants
of pigeonhole principle formulas behave with respect to polynomial calculus in the standard setting
with $n + 1$ pigeons and $n$ holes. Namely, while Onto-FPHP formulas are easy, both FPHP formulas and
Onto-PHP formulas are exponentially hard in $n$ even when restricted to bounded-degree expanders.

A natural next step would be to see if this generalized framework can also be used to attack other
interesting formula families which are known to be hard for resolution but for which there are currently
no lower bounds in polynomial calculus. In particular, can our framework or some modification of it
prove a lower bound for refuting the formulas encoding that a graph does not contain an independent set
of size $k$, which were proven hard for resolution in [BIS07]? Or what about the formulas stating that a
graph is $k$-colorable, for which resolution lower bounds were established in [BCMM05]?

Returning to the pigeonhole principle, we now understand how different encodings behave in polynomial
calculus when we have $n + 1$ pigeons and $n$ holes. But what happens when we increase the
number of pigeons? For instance, do the formulas become easier if we have $n^2$ pigeons and $n$ holes?
(This is the point where lower bound techniques based on degree break down.) What about arbitrarily
many pigeons? In resolution these questions are fairly well understood, as witnessed by the works of
Raz [Raz04a] and Razborov [Raz01, Raz03, Raz04b], but as far as we are aware they remain wide open
for polynomial calculus.

Finally, we want to point out an intriguing contrast between our work and that of Alekhnovich and
Razborov. As discussed in the introduction, the main technical result in [AR03] is that when the incidence
graph of a set of polynomial equations is expanding and the polynomials are immune, i.e., have no
low-degree consequences, then refuting this set of equations is hard with respect to polynomial calculus
degree. Since clauses of width $w$ have maximal immunity $w$, it follows that for a CNF formula $F$
expansion of the clause-variable incidence graph $G(F)$ is enough to imply hardness. A natural way
of interpreting our work would be to say that we simply extend this result to a slightly more general
constraint-variable incidence graph. On closer inspection, however, this analogy seems to be misleading,
and since we were quite surprised by this ourselves we want to elaborate briefly on this.

For the functional pigeonhole principle, the pigeon and functionality axioms for a pigeon $u$ taken
together imply the polynomial equation $\sum_{v \in N(u)} x_{u,v} = 1$ (summing over all holes $v \in N(u)$ to
which the pigeon $u$ can fly). Since this is a degree-1 consequence, it shows that the pigeonhole axioms in FPHP
formulas have lowest possible immunity modulo the set $E$ consisting of hole and functionality axioms.
Nevertheless, our lower bound proof still works, and only needs expansion of the constraint-variable
graph although the immunity of the constraints is non-existent.
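
The degree-1 consequence can be derived in one line. The computation below assumes the translation of clauses to polynomials in which true is 1, so that a pigeon axiom becomes a product of factors $(1 - x_{u,v})$ and a functionality axiom becomes a product of two variables; under the opposite convention the same computation goes through with every $x_{u,v}$ replaced by $1 - x_{u,v}$.

```latex
% Pigeon axiom (4.2a):          \prod_{v \in N(u)} (1 - x_{u,v}) = 0
% Functionality axioms (4.2c):  x_{u,v} x_{u,v'} = 0  for v \neq v'
\begin{align*}
0 = \prod_{v \in N(u)} (1 - x_{u,v})
  &= 1 - \sum_{v \in N(u)} x_{u,v} + \sum_{\{v,v'\} \subseteq N(u)} x_{u,v}\, x_{u,v'} - \cdots \\
  &\equiv 1 - \sum_{v \in N(u)} x_{u,v}
   \pmod{\{\, x_{u,v}\, x_{u,v'} : v \neq v' \,\}} .
\end{align*}
```

Every monomial of degree at least 2 in the expansion is a multiple of some functionality axiom, so only the constant and linear terms survive, giving $\sum_{v \in N(u)} x_{u,v} = 1$.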

On the other hand, the constraint-variable incidence graph of a random set of parity constraints is
expanding asymptotically almost surely, and since over fields of characteristic distinct from 2 parity
constraints have high immunity (see, for instance, [Gre00]), the techniques in [AR03] can be used to
prove strong degree lower bounds in such a setting. However, it seems that our framework of respectful
boundary expansion is inherently unable to establish this result. The problem is that (as discussed in the
footnote after Definition 3.6) it is not possible to group variables together in such a way as to ensure
respectful neighbourhood relations. At a high level, it seems that the main ingredient needed for our
technique to work is that clauses/polynomials and variables can be grouped together in such a way that
the effects of assignments to a group of variables can always be contained in a small neighbourhood of
clauses/polynomials, which the assignments (mostly) satisfy, and do not propagate beyond this
neighbourhood. Functional pigeonhole principle formulas over bounded-degree graphs have this property,
since assigning a pigeon $u$ to a hole $v$ only affects the neighbouring holes of $u$ and the neighbouring
pigeons of $v$, respectively. There is no such way to contain the effects locally when one starts satisfying
individual equations in an expanding set of parity constraints, however, regardless of the characteristic
of the underlying field.

In view of this, it seems that our techniques and those of [AR03] are closer to being orthogonal rather
than parallel. It would be desirable to gain a deeper understanding of what is going on here. In particular,
in comparison to [AR03], which gives clear, explicit criteria for hardness (is the graph expanding? are the
polynomials immune?), our work is less explicit in that it says that hardness is implied by the existence
of a “clustered clause-variable incidence graph” with the right properties, but gives no guidance as to
if and how such a graph might be built. It would be very interesting to find more general criteria of
hardness that could capture both our approach and that of [AR03], and ideally provide a unified view of
these lower bound techniques.